Abstract

Inside individual cells, expression of genes is inherently stochastic and manifests as cell-to-cell
variability or noise in protein copy numbers. Since protein half-lives can be comparable
to the cell-cycle length, randomness in cell-division times generates additional intercellular
variability in protein levels. Moreover, as many mRNA/protein species are expressed at
low copy numbers, errors incurred in partitioning of molecules between the mother and
daughter cells are significant. We derive analytical formulas for the total noise in protein
levels for a general class of cell-division time and partitioning error distributions. Using
a novel hybrid approach, the total noise is decomposed into components arising from i)
stochastic expression; ii) partitioning errors at the time of cell-division; and iii) random
cell-division events. These formulas reveal that random cell-division times not only generate
additional extrinsic noise but also critically affect the mean protein copy numbers and
intrinsic noise components. Counterintuitively, in some parameter regimes noise in protein
levels can decrease as cell-division times become more stochastic. Computations are
extended to consider genome duplication, where the gene dosage is increased two-fold at
a random point in the cell-cycle. We systematically investigate how the timing of genome
duplication influences different protein noise components. Intriguingly, results show that
the noise contribution from stochastic expression is minimized at an optimal genome duplication
time. Our theoretical results motivate new experimental methods for decomposing
protein noise levels from single-cell expression data. Characterizing the contributions of
individual noise mechanisms will lead to precise estimates of gene expression parameters
and techniques for altering stochasticity to change the phenotype of individual cells.
1 Introduction


The level of a protein can deviate considerably from cell to cell, in spite of the fact that cells are genetically identical and are in the same extracellular environment [1-3]. This intercellular variation or noise in protein counts has been implicated in diverse processes such as corrupting functioning of gene networks [4-6], driving probabilistic cell-fate decisions [7-12], buffering cell populations from hostile changes in the environment [13-16], and causing clonal cells to respond differently to the same stimulus [17-19]. An important source of noise driving random fluctuations in protein levels is stochastic gene expression due to the inherent probabilistic nature of biochemical processes [20-23]. Recent experimental studies have uncovered additional noise sources that affect protein copy numbers. For example, the time taken to complete the cell-cycle (i.e., the time between two successive cell-division events) has been observed to be stochastic across organisms [24-32]. Given that many proteins/mRNAs are present inside cells at low copy numbers, errors incurred in partitioning of molecules between the mother and daughter cells are significant [33-35]. Finally, the time at which a particular gene of interest is duplicated can also vary between cells [36,37]. We investigate how such noise sources in the cell-cycle process combine with stochastic gene expression to generate intercellular variability in protein copy numbers (Fig. 1).
Figure 1. Sample trajectory of the protein level in a single cell with different sources of
noise. Stochastically expressed proteins accumulate within the cell at a certain rate. At a random point 
in the cell-cycle, gene-duplication results in an increase in production rate. Stochastic cell-division events 
lead to random partitioning of protein molecules between the mother and daughter cells with each cell 
receiving, on average, half the number of proteins in the mother cell just before division. The steady-state 
protein copy number distribution obtained from a large number of trajectories is shown on the right. The 
total noise in the protein level, as measured by the Coefficient of Variation (CV) squared can be broken 
into contributions from individual noise mechanisms. 


Prior studies that quantify the effects of cell-division on the protein noise level have been restricted to specific cases. For example, noise computations have been done in stochastic gene expression models where cell-divisions occur at deterministic time intervals [33,38,39]. Recently, we have analyzed a deterministic model of gene expression with random cell-division events [40]. Building on this work, we formulate a mathematical model that couples stochastic expression of a stable protein with random cell-division events that follow an arbitrary probability distribution function. Moreover, at the time of cell-division, proteins are randomly partitioned between the mother and daughter cells based on a general framework that allows the partitioning errors to be higher or lower than predicted by binomial partitioning. For this class of models, we derive an exact analytical formula for the protein noise level as quantified by the steady-state Coefficient of Variation (CV) squared. This formula is further decomposed into individual components representing contributions from different noise sources. A systematic investigation of this formula leads to novel insights, such as identification of regimes where increasing randomness in the timing of cell-division events decreases the protein noise level.

Next, we extend the above model to include genome duplication events that increase the gene's transcription rate two-fold (corresponding to doubling of gene dosage) prior to cell-division [36,41]. To our knowledge, this is the first study integrating randomness in the genome duplication process with stochastic gene expression. An exact formula for the protein noise level is derived for this extended model and used to investigate how the timing of duplication affects different noise components. Counterintuitively, results show that doubling of the transcription rate within the cell-cycle can lead to smaller fluctuations in protein levels as compared to a constant transcription rate throughout the cell-cycle. Finally, we discuss how formulas obtained in this study can be used to infer parameters and characterize the gene expression process from single-cell studies.


2 Coupling gene expression to cell-division 


We consider the standard model of stochastic gene expression [42,43], where mRNAs are transcribed at exponentially distributed time intervals from a constitutive gene with rate $k_x$. For the time being, we exclude genome duplication and the transcription rate is fixed throughout the cell-cycle. Assuming short-lived mRNAs, each transcription event results in a burst of proteins [43-45]. The corresponding jump in protein levels is shown as

$$x(t) \mapsto x(t) + B, \tag{1}$$

where $x(t)$ is the protein population count in the mother cell at time $t$, and $B$ is a random burst size drawn from a positively-valued distribution that represents the number of protein molecules synthesized in a single-mRNA lifetime. Motivated by observations in E. coli and mammalian cells, where many proteins have half-lives considerably longer than the cell-doubling time, we assume a stable protein with no active degradation [46-48]. Thus, proteins accumulate within the cell until the time of cell-division, at which point they are randomly partitioned between the mother and daughter cells.
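Let cell-division events occur at times $t_s$, $s \in \{1, 2, \ldots\}$. The cell-cycle time

$$T := t_s - t_{s-1} \tag{2}$$

follows an arbitrary positively-valued probability distribution with the following mean and coefficient of variation (CV) squared

$$\langle T \rangle = \langle t_s - t_{s-1} \rangle, \qquad CV_T^2 = \frac{\langle T^2 \rangle - \langle T \rangle^2}{\langle T \rangle^2}, \tag{3}$$

where $\langle \cdot \rangle$ denotes expected value throughout this paper. The random change in $x(t)$ during cell-division is given by

$$x(t_s) \mapsto x_+(t_s), \tag{4}$$

where $x(t_s)$ and $x_+(t_s)$ denote the protein levels in the mother cell just before and after division, respectively. Conditioned on $x(t_s)$, $x_+(t_s)$ is assumed to have the following statistics

$$\langle x_+(t_s) \mid x(t_s) \rangle = \frac{x(t_s)}{2}, \qquad \left\langle x_+^2(t_s) - \langle x_+(t_s) \rangle^2 \mid x(t_s) \right\rangle = \frac{\alpha\, x(t_s)}{4}. \tag{5}$$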
The first equation implies symmetric division, i.e., on average the mother cell inherits half the number of protein molecules present just before division. The second equation in (5) describes the variance of $x_+(t_s)$ and quantifies the error in partitioning of molecules through the non-negative parameter $\alpha$. For example, $\alpha = 0$ represents deterministic partitioning where $x_+(t_s) = x(t_s)/2$ with probability equal to one. A more realistic model for partitioning is each molecule having an equal probability of being in the mother or daughter cell [49-51]. This results in a binomial distribution for $x_+(t_s)$

$$\text{Probability}\{x_+(t_s) = j \mid x(t_s)\} = \frac{x(t_s)!}{j!\,\left(x(t_s) - j\right)!} \left(\frac{1}{2}\right)^{x(t_s)}, \quad j \in \{0, 1, \ldots, x(t_s)\}, \tag{6}$$

and corresponds to $\alpha = 1$ in (5). Interestingly, recent studies have shown that partitioning of proteins that form clusters or multimers can result in $\alpha > 1$ in (5), i.e., partitioning errors are much higher than predicted by the binomial distribution [33,39]. In contrast, if molecules push each other to opposite poles of the cell, then the partitioning errors will be smaller than predicted by (6) and $\alpha < 1$.
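As a small check on the correspondence between binomial partitioning and $\alpha = 1$, the sketch below evaluates the distribution in (6) exactly and reads off $\alpha$ from the variance relation in (5). The function names are illustrative and not from the paper.

```python
from fractions import Fraction
from math import comb

def binomial_partition_pmf(x):
    """Eq. (6): probability that the mother cell keeps j of x molecules."""
    return [Fraction(comb(x, j), 2 ** x) for j in range(x + 1)]

def partition_alpha(pmf):
    """Return the mean of x_+ and the alpha implied by eq. (5): variance = alpha * x / 4."""
    x = len(pmf) - 1
    mean = sum(j * p for j, p in enumerate(pmf))
    var = sum(j * j * p for j, p in enumerate(pmf)) - mean ** 2
    return mean, 4 * var / x

# Hypothetical example: 20 molecules just before division.
mean, alpha = partition_alpha(binomial_partition_pmf(20))
print(mean, alpha)
```

The mean comes out as half the pre-division count and the inferred $\alpha$ equals one; any other partitioning distribution with the same mean can be passed to `partition_alpha` to obtain its $\alpha$.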

The model with all the different noise mechanisms (stochastic expression, random cell-division events and partitioning errors) is illustrated in Fig. 2A and referred to as the full model. We also introduce two additional hybrid models [52,53], where protein production and partitioning are considered in their deterministic limit (Fig. 2B-C). Note that unlike the full model, where $x(t)$ takes non-negative integer values, $x(t)$ is continuous in the hybrid models. We will use these hybrid models for decomposing the protein noise level obtained from the full model into individual components representing contributions from different noise sources. However, before computing the noise, we first determine the average number of proteins as a function of the cell-cycle time distribution.


[Figure 2 panels. A) Stochastic cell cycle, partitioning & production. B) Stochastic cell cycle; deterministic partitioning & production. C) Stochastic cell cycle & partitioning; deterministic production.]

Figure 2. Stochastic models of gene expression with cell-division. Arrows denote stochastic events that change the protein level by discrete jumps as shown in (1) and (4). The differential equation within the circle represents the time evolution of $x(t)$ in between events. A) Model with all the different sources of noise: proteins are expressed in stochastic bursts, cell-division occurs at random times, and molecules are partitioned between the mother and daughter cells based on (5). The trivial dynamics $\dot{x} = 0$ signifies that the protein level is constant in between stochastic events. B) Hybrid model where randomness in cell-division events is the only source of noise. Protein production is modeled deterministically through a differential equation and partitioning errors are absent, i.e., $\alpha = 0$ in (5). C) Hybrid model where noise comes from both cell-division events and partitioning errors. Protein production is considered deterministically as in Fig. 2B. Since $x(t)$ is continuous here, $x_+(t_s)$ has a positively-valued continuous distribution with the same mean and variance as in (5).


3 Computing the average number of protein molecules 

To quantify the steady-state mean protein level we consider the full model illustrated in Fig. 2A. It turns out that all the models shown in Fig. 2 are identical in terms of finding $\langle x(t) \rangle$ and in principle any one of them could have been used. To obtain differential equations describing the time evolution of $\langle x(t) \rangle$ we model the cell-cycle time through a phase-type distribution, which can be represented by a continuous-time Markov chain. Phase-type distributions are dense in the class of positively-valued continuous distributions, i.e., one can always construct a sequence of phase-type distributions that converges pointwise to a given distribution of interest [54]. We use this denseness property as a practical tool for modeling the cell-cycle time.


3.1 Cell-cycle time as a phase-type distribution 

We consider a class of phase-type distributions that consists of a mixture of Erlang distributions. Recall that an Erlang distribution of order $i$ is the distribution of the sum of $i$ independent and identical exponential random variables. The cell-cycle time is assumed to have an Erlang distribution of order $i$ with probability $p_i$, $i = \{1, \ldots, n\}$, and can be represented by a continuous-time Markov chain with states $G_{ij}$, $j = \{1, \ldots, i\}$, $i = \{1, \ldots, n\}$ (Fig. 3). Let Bernoulli random variables $g_{ij} = 1$ if the system resides in state $G_{ij}$ and 0 otherwise. The probability of transition $G_{ij} \to G_{i(j+1)}$ in the next infinitesimal time interval $[t, t + dt)$ is given by $k g_{ij}\,dt$, implying that the time spent in each state $G_{ij}$ is exponentially distributed with mean $1/k$. To summarize, at the start of the cell-cycle, a state $G_{i1}$, $i = \{1, \ldots, n\}$, is chosen with probability $p_i$ and cell-division occurs after transitioning through $i$ exponentially distributed steps. Based on this formulation, the probability of a cell-division event occurring in the next time interval $[t, t + dt)$ is given by $k \sum_{j=1}^{n} g_{jj}\,dt$, and whenever the event occurs, the protein level changes as per (4).

Figure 3. A continuous-time Markov chain model for the cell-cycle time. The cell-cycle time is assumed to follow a mixture of Erlang distributions. At the start of the cell-cycle, a state $G_{i1}$, $i = \{1, \ldots, n\}$, is chosen with probability $p_i$. The cell-cycle transitions through states $G_{ij}$, $j = \{1, \ldots, i\}$, residing for an exponentially distributed time with mean $1/k$ in each state. Cell-division occurs after exit from $G_{ii}$ and the above process is repeated.

Finally, the mean and the coefficient of variation squared of the cell-cycle time are obtained as

$$\langle T \rangle = \sum_{i=1}^{n} p_i \frac{i}{k}, \qquad CV_T^2 = \frac{1}{k^2 \langle T \rangle^2} \sum_{i=1}^{n} p_i\, i(i+1) - 1, \tag{7}$$

in terms of the Markov chain parameters. Our goal is to obtain $\overline{\langle x \rangle} := \lim_{t \to \infty} \langle x(t) \rangle$ as a function of $\langle T \rangle$ and $CV_T^2$.
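The moments of an Erlang mixture follow from the standard Erlang moments (an order-$i$ Erlang with step rate $k$ has $\langle T^m \rangle = i(i+1)\cdots(i+m-1)/k^m$). A minimal sketch, with illustrative function and argument names, that evaluates the mean, $CV_T^2$ and third moment for a given set of mixture weights:

```python
def erlang_mixture_moments(p, k):
    """Mean, CV^2 and third moment of a mixture of Erlang distributions.

    p[i-1] is the probability of the order-i branch and k is the rate of
    every exponential step, following the notation of eq. (7).
    """
    orders = range(1, len(p) + 1)
    m1 = sum(pi * i for i, pi in zip(orders, p)) / k
    m2 = sum(pi * i * (i + 1) for i, pi in zip(orders, p)) / k**2
    m3 = sum(pi * i * (i + 1) * (i + 2) for i, pi in zip(orders, p)) / k**3
    cv2 = m2 / m1**2 - 1
    return m1, cv2, m3

# Pure exponential (order 1) versus a pure Erlang of order 4, both with mean 1.
print(erlang_mixture_moments([1.0], k=1.0))
print(erlang_mixture_moments([0.0, 0.0, 0.0, 1.0], k=4.0))
```

The exponential case has $CV_T^2 = 1$ and the order-4 Erlang has $CV_T^2 = 1/4$, which illustrates how the mixture weights tune the randomness of the cell-cycle time at a fixed mean.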



3.2 Time evolution of the mean protein level 


Time evolution of the statistical moments of $x(t)$ can be obtained from the Kolmogorov forward equations corresponding to the full model in Fig. 2A combined with the cell-division process described in Fig. 3. We refer the reader to [52,55,56] for an introduction to moment dynamics for stochastic and hybrid systems. Analysis in Appendix A shows

$$\frac{d\langle x \rangle}{dt} = k_x \langle B \rangle - \frac{k}{2} \sum_{j=1}^{n} \langle x g_{jj} \rangle. \tag{8}$$


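Note that the time-derivative of the mean protein level (first-order moment) is unclosed, in the sense that it depends on the second-order moments $\langle x g_{jj} \rangle$. Typically, approximate closure methods are used to solve for moments in such cases [52,56-61]. However, the fact that $g_{ij}$ is binary can be exploited to automatically close moment dynamics. In particular, since $g_{ij} \in \{0, 1\}$

$$\langle g_{ij}^{n} x^{m} \rangle = \langle g_{ij} x^{m} \rangle, \quad n \in \{1, 2, \ldots\}, \tag{9}$$

for any non-negative integer $m$. Moreover, as only a single state $g_{ij}$ can be 1 at any time

$$\langle g_{ij}\, g_{rq}\, x^{m} \rangle = 0, \quad \text{if } i \neq r \text{ or } j \neq q. \tag{10}$$

Using (9) and (10), the time evolution of $\langle x g_{ij} \rangle$ is obtained as

$$\frac{d\langle x g_{i1} \rangle}{dt} = \frac{k_x \langle B \rangle p_i}{k \langle T \rangle} + \frac{k}{2}\, p_i \sum_{j=1}^{n} \langle x g_{jj} \rangle - k \langle x g_{i1} \rangle, \tag{11a}$$

$$\frac{d\langle x g_{ij} \rangle}{dt} = \frac{k_x \langle B \rangle p_i}{k \langle T \rangle} - k \langle x g_{ij} \rangle + k \langle x g_{i(j-1)} \rangle, \quad j = \{2, \ldots, i\}, \tag{11b}$$

and only depends on $\langle x g_{ij} \rangle$ (see Appendix A); the production term uses the steady-state occupancy $\langle g_{ij} \rangle = p_i / (k \langle T \rangle)$ of state $G_{ij}$. Thus, (8) and (11) constitute a closed system of linear differential equations from which moments can be computed exactly.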

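To obtain an analytical formula for the average number of proteins, we start by performing a steady-state analysis of (8) that yields

$$\sum_{j=1}^{n} \overline{\langle x g_{jj} \rangle} = \frac{2 k_x \langle B \rangle}{k}, \tag{12}$$

where $\overline{\langle \cdot \rangle}$ denotes the expected value in the limit $t \to \infty$. Using (12), $\overline{\langle x g_{i1} \rangle}$ is determined from (11a), and then all moments $\overline{\langle x g_{ij} \rangle}$ are obtained recursively by performing a steady-state analysis of (11b) for $j = \{2, \ldots, i\}$. This analysis results in

$$\overline{\langle x g_{ij} \rangle} = \frac{k_x \langle B \rangle}{k}\, p_i \left(1 + \frac{j}{k \langle T \rangle}\right). \tag{13}$$

Using (7), (13) and the fact that $\sum_{i=1}^{n} \sum_{j=1}^{i} g_{ij} = 1$ we obtain the following expression for the mean protein level

$$\overline{\langle x \rangle} = \overline{\Big\langle x \sum_{i=1}^{n} \sum_{j=1}^{i} g_{ij} \Big\rangle} = \sum_{i=1}^{n} \sum_{j=1}^{i} \overline{\langle x g_{ij} \rangle} = \frac{k_x \langle B \rangle \langle T \rangle \left(3 + CV_T^2\right)}{2}. \tag{14}$$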
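The step from (13) to (14) can be checked numerically. The sketch below sums the per-state steady-state moments of (13) over all states $G_{ij}$ and compares the result with the closed form (14); the mixture weights and rates are hypothetical values chosen only for illustration.

```python
def mean_from_states(p, k, kx, mean_B):
    """Sum the steady-state moments of eq. (13) over all states G_ij."""
    mean_T = sum(pi * i for i, pi in enumerate(p, start=1)) / k
    total = 0.0
    for i, pi in enumerate(p, start=1):
        for j in range(1, i + 1):
            total += kx * mean_B / k * pi * (1 + j / (k * mean_T))
    return total

def mean_closed_form(p, k, kx, mean_B):
    """Eq. (14), with <T> and CV_T^2 computed from eq. (7)."""
    mean_T = sum(pi * i for i, pi in enumerate(p, start=1)) / k
    second = sum(pi * i * (i + 1) for i, pi in enumerate(p, start=1)) / k**2
    cv2_T = second / mean_T**2 - 1
    return kx * mean_B * mean_T * (3 + cv2_T) / 2

p = [0.3, 0.0, 0.7]  # hypothetical mixture weights for Erlang orders 1..3
print(mean_from_states(p, k=2.0, kx=10.0, mean_B=1.5))
print(mean_closed_form(p, k=2.0, kx=10.0, mean_B=1.5))
```

Both numbers agree up to floating-point rounding, whatever weights are chosen, which reflects that (14) depends on the phase-type representation only through $\langle T \rangle$ and $CV_T^2$.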


It is important to point out that (14) holds irrespective of the complexity, i.e., the number of states $G_{ij}$ used in the phase-type distribution to approximate the cell-cycle time distribution. As expected, $\overline{\langle x \rangle}$ increases linearly with the average cell-cycle duration $\langle T \rangle$, with longer cell-cycles resulting in more accumulation of proteins. Consistent with previous findings, (14) shows that the mean protein level is also affected by the randomness in the cell-cycle times ($CV_T^2$) [40,62]. For example, $\overline{\langle x \rangle}$ reduces by 25% as $T$ changes from being exponentially distributed ($CV_T^2 = 1$) to periodic ($CV_T^2 = 0$) for fixed $\langle T \rangle$. Next, we determine the noise in protein copy numbers, as quantified by the coefficient of variation squared.
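The 25% difference between exponential and periodic cell-cycle times can be probed with a direct simulation of the full model. The following is a minimal sketch under stated assumptions, not the authors' code: the burst size is fixed at $B = 1$, partitioning is binomial, a single lineage is followed, and the steady-state mean is estimated by a time average. All names and parameter values are illustrative.

```python
import random

def simulate_mean(draw_cycle_time, kx=10.0, n_cycles=20000, seed=0):
    """Time-averaged protein count along one lineage of the full model (Fig. 2A).

    Assumptions of this sketch: burst size B = 1 and binomial partitioning.
    """
    rng = random.Random(seed)
    x = 0          # protein count in the tracked (mother) cell
    area = 0.0     # running integral of x(t) dt
    elapsed = 0.0  # total simulated time
    for _ in range(n_cycles):
        T = draw_cycle_time(rng)
        t = 0.0
        while True:
            wait = rng.expovariate(kx)  # waiting time to the next burst
            if t + wait >= T:
                area += x * (T - t)
                break
            area += x * wait
            t += wait
            x += 1                      # burst of size B = 1
        elapsed += T
        # Binomial partitioning: each molecule is retained with probability 1/2.
        x = sum(1 for _ in range(x) if rng.random() < 0.5)
    return area / elapsed

exponential = simulate_mean(lambda rng: rng.expovariate(1.0))  # CV_T^2 = 1
periodic = simulate_mean(lambda rng: 1.0)                      # CV_T^2 = 0
print(exponential, periodic)
```

With $k_x = 10$, $\langle B \rangle = 1$ and $\langle T \rangle = 1$, formula (14) predicts means of 20 and 15 for the two cases, so the estimates should land close to these values up to sampling error.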


4 Computing the protein noise level 

Recall that the full model introduced in Fig. 2A has three distinct noise mechanisms. Our strategy for 
computing the protein noise level is to first analyze the model with a single noise source, and then consider 
models with two and three sources. As shown below, this approach provides a systematic dissection of 
the protein noise level into components representing contributions from different mechanisms. 
4.1 Contribution from randomness in cell-cycle times

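We begin with the model shown in Fig. 2B, where noise comes from a single source: random cell-division events. For this model, the time evolution of the second-order moment of the protein copy number is obtained as

$$\frac{d\langle x^2 \rangle}{dt} = 2 k_x \langle B \rangle \langle x \rangle - \frac{3k}{4} \sum_{j=1}^{n} \langle x^2 g_{jj} \rangle, \tag{15}$$

and depends on third-order moments $\langle x^2 g_{jj} \rangle$ (see Appendix B). Using the approach introduced earlier for obtaining the mean protein level, we close moment equations by writing the time evolution of moments $\langle x^2 g_{ij} \rangle$. Using (9) and (10)

$$\frac{d\langle x^2 g_{i1} \rangle}{dt} = 2 k_x \langle B \rangle \langle x g_{i1} \rangle + \frac{k}{4}\, p_i \sum_{j=1}^{n} \langle x^2 g_{jj} \rangle - k \langle x^2 g_{i1} \rangle, \tag{16a}$$

$$\frac{d\langle x^2 g_{ij} \rangle}{dt} = 2 k_x \langle B \rangle \langle x g_{ij} \rangle - k \langle x^2 g_{ij} \rangle + k \langle x^2 g_{i(j-1)} \rangle, \quad j = \{2, \ldots, i\}. \tag{16b}$$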

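Note that the moment dynamics for $\langle x \rangle$ and $\langle x g_{ij} \rangle$ obtained in the previous section (equations (8) and (11)) are identical for all the models in Fig. 2, irrespective of whether the noise mechanism is modeled deterministically or stochastically. Equations (8), (11), (15) and (16) represent a closed set of linear differential equations and their steady-state analysis yields

$$\overline{\langle x^2 g_{ij} \rangle} = \frac{k_x^2 \langle B \rangle^2 \langle T \rangle \left(3 + CV_T^2\right)}{3k}\, p_i + \frac{2 k_x^2 \langle B \rangle^2}{k^2} \left( \frac{j(j+1)}{2 k \langle T \rangle} + j \right) p_i. \tag{17}$$

From (17)

$$\overline{\langle x^2 \rangle} = \overline{\Big\langle x^2 \sum_{i=1}^{n} \sum_{j=1}^{i} g_{ij} \Big\rangle} = \sum_{i=1}^{n} \sum_{j=1}^{i} \overline{\langle x^2 g_{ij} \rangle} = k_x^2 \langle B \rangle^2\, \frac{\langle T^3 \rangle + 4\, CV_T^2 \langle T \rangle^3 + 6 \langle T \rangle^3}{3 \langle T \rangle}, \tag{18a}$$

$$\langle T^3 \rangle = \frac{1}{k^3} \sum_{i=1}^{n} p_i\, i(i+1)(i+2), \tag{18b}$$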

where $\langle T^3 \rangle$ is the third-order moment of the cell-cycle time. Using (18) and the mean protein count quantified in (14), we obtain the following coefficient of variation squared

$$CV_E^2 = \frac{1}{27} + \frac{4 \left( 9 \dfrac{\langle T^3 \rangle}{\langle T \rangle^3} - 9 - 6\, CV_T^2 - 7\, CV_T^4 \right)}{27 \left(3 + CV_T^2\right)^2}, \tag{19}$$

which represents the noise contribution from random cell-division events. Since cell-division is a global event that affects expression of all genes, this noise contribution can also be referred to as extrinsic noise [49,63-66]. In reality, there would be other sources of extrinsic noise, such as fluctuations in the gene-expression machinery, that we have ignored in this analysis.

Note that $CV_E^2 \to 1/27$ as $T$ approaches a delta distribution, i.e., cell divisions occur at fixed time intervals. We discuss simplifications of (19) in various limits. For example, if the time taken to complete the cell-cycle is lognormally distributed, then

$$\frac{\langle T^3 \rangle}{\langle T \rangle^3} = \left(1 + CV_T^2\right)^3 \implies CV_E^2 = \frac{1}{27} + \frac{4 \left( 21\, CV_T^2 + 20\, CV_T^4 + 9\, CV_T^6 \right)}{27 \left(3 + CV_T^2\right)^2}, \tag{20}$$

and extrinsic noise monotonically increases with $CV_T^2$. If fluctuations in $T$ around $\langle T \rangle$ are small, then using Taylor series

$$\langle T^3 \rangle \approx \langle T \rangle^3 + 3 \langle T \rangle^3\, CV_T^2. \tag{21}$$

Substituting (21) in (19) and ignoring $CV_T^4$ and higher order terms yields

$$CV_E^2 \approx \frac{1}{27} + \frac{28\, CV_T^2}{81}, \tag{22}$$

where the first term is the extrinsic noise for $CV_T^2 \to 0$ and the second term is the additional noise due to random cell-division events.
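The extrinsic noise formula and its two simplifications can be compared numerically. The sketch below implements (19), the lognormal special case (20) and the small-noise approximation (22); function names are illustrative, and the third-moment ratio $\langle T^3 \rangle / \langle T \rangle^3$ is passed in as an argument.

```python
def extrinsic_noise(cv2_T, third_moment_ratio):
    """Eq. (19); third_moment_ratio is <T^3> / <T>^3."""
    num = 4 * (9 * third_moment_ratio - 9 - 6 * cv2_T - 7 * cv2_T**2)
    return 1 / 27 + num / (27 * (3 + cv2_T) ** 2)

def extrinsic_noise_lognormal(cv2_T):
    """Eq. (20): lognormal cell-cycle times have <T^3>/<T>^3 = (1 + CV_T^2)^3."""
    return extrinsic_noise(cv2_T, (1 + cv2_T) ** 3)

def extrinsic_noise_small(cv2_T):
    """Eq. (22): first-order approximation for small CV_T^2."""
    return 1 / 27 + 28 * cv2_T / 81

for cv2 in (0.0, 0.01, 0.05, 0.2):
    print(cv2, extrinsic_noise_lognormal(cv2), extrinsic_noise_small(cv2))
```

For deterministic cell-cycle times (`cv2_T = 0`, ratio 1) the function returns the baseline 1/27, and for exponentially distributed times (`cv2_T = 1`, ratio 6) it evaluates to 1/3. The linear approximation tracks the lognormal result closely only while $CV_T^2$ is small.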


4.2 Contribution from partitioning errors 

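Next, we consider the model illustrated in Fig. 2C with both random cell-division events and partitioning of protein between the mother and daughter cells. Thus, the protein noise level here represents the contribution from both these sources. Analysis in Appendix C shows that the time evolution of $\langle x^2 \rangle$ and $\langle x^2 g_{ij} \rangle$ is given by

$$\frac{d\langle x^2 \rangle}{dt} = 2 k_x \langle B \rangle \langle x \rangle + \frac{1}{4} \alpha k \sum_{j=1}^{n} \langle x g_{jj} \rangle - \frac{3k}{4} \sum_{j=1}^{n} \langle x^2 g_{jj} \rangle, \tag{23a}$$

$$\frac{d\langle x^2 g_{i1} \rangle}{dt} = 2 k_x \langle B \rangle \langle x g_{i1} \rangle + \frac{k}{4}\, p_i \sum_{j=1}^{n} \langle x^2 g_{jj} \rangle + \frac{1}{4} \alpha k\, p_i \sum_{j=1}^{n} \langle x g_{jj} \rangle - k \langle x^2 g_{i1} \rangle, \tag{23b}$$

$$\frac{d\langle x^2 g_{ij} \rangle}{dt} = 2 k_x \langle B \rangle \langle x g_{ij} \rangle - k \langle x^2 g_{ij} \rangle + k \langle x^2 g_{i(j-1)} \rangle, \quad j = \{2, \ldots, i\}. \tag{23c}$$

Note that (23a)-(23b) are slightly different from their counterparts obtained in the previous section (equations (15) and (16a)), with additional terms that depend on $\alpha$, where $\alpha$ quantifies the degree of partitioning error as defined in (5). As expected, (23) reduces to (15)-(16) when $\alpha = 0$ (i.e., deterministic partitioning). Computing $\overline{\langle x^2 g_{ij} \rangle}$ by performing a steady-state analysis of (23) and using a similar approach as in (18) we obtain

$$\overline{\langle x^2 \rangle} = k_x^2 \langle B \rangle^2\, \frac{\langle T^3 \rangle + 4\, CV_T^2 \langle T \rangle^3 + 6 \langle T \rangle^3}{3 \langle T \rangle} + \frac{2 \alpha k_x \langle B \rangle \langle T \rangle}{3}. \tag{24}$$

Finding $CV^2$ of the protein level and subtracting the extrinsic noise found in (19) yields

$$CV_R^2 = \frac{4 \alpha}{3 \left(3 + CV_T^2\right)}\, \frac{1}{\overline{\langle x \rangle}}, \tag{25}$$

where $CV_R^2$ represents the contribution of partitioning errors to the protein noise level. Intriguingly, while $CV_R^2$ increases with $\alpha$, it decreases with $CV_T^2$. Thus, as cell-division times become more random for a fixed $\langle T \rangle$ and $\overline{\langle x \rangle}$, the noise contribution from partitioning errors decreases.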
4.3 Contribution from stochastic expression

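Finally, we consider the full model in Fig. 2A with all the three different noise sources. For this model, moment dynamics is obtained as (see Appendix D)

$$\frac{d\langle x^2 \rangle}{dt} = k_x \langle B^2 \rangle + 2 k_x \langle B \rangle \langle x \rangle + \frac{1}{4} \alpha k \sum_{j=1}^{n} \langle x g_{jj} \rangle - \frac{3k}{4} \sum_{j=1}^{n} \langle x^2 g_{jj} \rangle, \tag{26a}$$

$$\frac{d\langle x^2 g_{i1} \rangle}{dt} = \frac{k_x \langle B^2 \rangle p_i}{k \langle T \rangle} + 2 k_x \langle B \rangle \langle x g_{i1} \rangle + \frac{k}{4}\, p_i \sum_{j=1}^{n} \langle x^2 g_{jj} \rangle + \frac{1}{4} \alpha k\, p_i \sum_{j=1}^{n} \langle x g_{jj} \rangle - k \langle x^2 g_{i1} \rangle, \tag{26b}$$

$$\frac{d\langle x^2 g_{ij} \rangle}{dt} = \frac{k_x \langle B^2 \rangle p_i}{k \langle T \rangle} + 2 k_x \langle B \rangle \langle x g_{ij} \rangle - k \langle x^2 g_{ij} \rangle + k \langle x^2 g_{i(j-1)} \rangle, \quad j = \{2, \ldots, i\}. \tag{26c}$$

Compared to (23), (26) has additional terms of the form $k_x \langle B^2 \rangle$, where $\langle B^2 \rangle$ is the second-order moment of the protein burst size in (1). Performing an identical analysis as before we obtain

$$\overline{\langle x^2 \rangle} = k_x^2 \langle B \rangle^2\, \frac{\langle T^3 \rangle + 4\, CV_T^2 \langle T \rangle^3 + 6 \langle T \rangle^3}{3 \langle T \rangle} + \frac{2 \alpha k_x \langle B \rangle \langle T \rangle}{3} + \frac{k_x \langle B^2 \rangle \langle T \rangle \left(3\, CV_T^2 + 5\right)}{6} \tag{27}$$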

Figure 4. Scaling of noise as a function of the mean protein level for different mechanisms (horizontal axis: mean protein level per cell). The contribution of random cell-division events to the noise in protein copy numbers (extrinsic noise) is invariant of the mean. In contrast, contributions from partitioning errors at the time of cell-division (partitioning noise) and stochastic expression (production noise) scale inversely with the mean. The scaling factors are shown as a function of the protein random burst size $B$, noise in cell-cycle time ($CV_T^2$) and magnitude of partitioning errors quantified by $\alpha$ (see (5)). With increasing mean level the total noise first decreases and then reaches a baseline that corresponds to extrinsic noise. For this plot, $B$ is assumed to be geometrically distributed with mean $\langle B \rangle = 1.5$, $CV_T^2 = 0$ and $\alpha = 1$ (i.e., binomial partitioning).
Equation (27) yields the following total protein noise level

$$CV^2 = CV_E^2 + CV_R^2 + CV_P^2 = CV_E^2 + \underbrace{\underbrace{\frac{4 \alpha}{3 \left(3 + CV_T^2\right)}\, \frac{1}{\overline{\langle x \rangle}}}_{\text{Partitioning noise } (CV_R^2)} + \underbrace{\frac{3\, CV_T^2 + 5}{3 \left(3 + CV_T^2\right)}\, \frac{\langle B^2 \rangle}{\langle B \rangle}\, \frac{1}{\overline{\langle x \rangle}}}_{\text{Production noise } (CV_P^2)}}_{\text{Intrinsic noise}} \tag{28}$$

that can be decomposed into three terms. The first is the extrinsic noise $CV_E^2$ representing the contribution from random cell-division events and given by (19). The second term $CV_R^2$ is the contribution from partitioning errors determined in the previous section (partitioning noise), and the final term $CV_P^2$ is the additional noise representing the contribution from stochastic expression (production noise). We refer to the sum of the contributions from partitioning errors and stochastic expression as intrinsic noise. These intrinsic and extrinsic noise components are generally obtained experimentally using the dual-color assay that measures the correlation in the expression of two identical copies of the gene [49].

Interestingly, for a fixed mean protein level $\overline{\langle x \rangle}$, $CV_T^2$ has opposite effects on $CV_R^2$ and $CV_P^2$. While $CV_R^2$ monotonically decreases with increasing $CV_T^2$, $CV_P^2$ increases with it. It turns out that in certain cases these effects can cancel each other out. For example, when $B = 1$ with probability one, i.e., proteins are synthesized one at a time at exponentially distributed time intervals, and $\alpha = 1$ (binomial partitioning)

$$CV^2 = CV_E^2 + \frac{4}{3 \left(3 + CV_T^2\right)}\, \frac{1}{\overline{\langle x \rangle}} + \frac{3\, CV_T^2 + 5}{3 \left(3 + CV_T^2\right)}\, \frac{1}{\overline{\langle x \rangle}} = CV_E^2 + \frac{1}{\overline{\langle x \rangle}}. \tag{29}$$

In this limit the intrinsic noise is always 1/Mean irrespective of the cell-cycle time distribution $T$. Note that the average number of proteins itself depends on $T$ as shown in (14). Another important limit is $CV_T^2 \to 0$, in which case (28) reduces to

$$CV^2 = \frac{1}{27} + \underbrace{\frac{4 \alpha}{9}\, \frac{1}{\overline{\langle x \rangle}} + \frac{5}{9}\, \frac{\langle B^2 \rangle}{\langle B \rangle}\, \frac{1}{\overline{\langle x \rangle}}}_{\text{Intrinsic noise}} \tag{30}$$

and is similar to the result obtained in [38] for deterministic cell-division times and binomial partitioning.
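The decomposition in (28) can be written as a short helper that returns the three components separately. This is an illustrative sketch: the argument names are assumptions, and the extrinsic term is passed in as a number because it depends on the third moment of $T$ through (19).

```python
def noise_components(mean_x, cv2_T, alpha, mean_B, mean_B2, cv2_E):
    """Split the total CV^2 of eq. (28) into its three parts.

    cv2_E is supplied by the caller (e.g. from eq. (19)); mean_B and mean_B2
    are the first and second moments of the burst size B.
    """
    partitioning = 4 * alpha / (3 * (3 + cv2_T)) / mean_x
    production = (3 * cv2_T + 5) / (3 * (3 + cv2_T)) * (mean_B2 / mean_B) / mean_x
    return {
        "extrinsic": cv2_E,
        "partitioning": partitioning,
        "production": production,
        "total": cv2_E + partitioning + production,
    }

# B = 1 with probability one and binomial partitioning (alpha = 1), as in eq. (29).
# cv2_E = 0.0 is a placeholder here; only the intrinsic part is inspected.
for cv2_T in (0.0, 0.5, 1.0):
    parts = noise_components(mean_x=50.0, cv2_T=cv2_T, alpha=1.0,
                             mean_B=1.0, mean_B2=1.0, cv2_E=0.0)
    print(cv2_T, parts["partitioning"] + parts["production"])
```

In this special case the intrinsic part equals $1/\overline{\langle x \rangle}$ for every $CV_T^2$, even though the partitioning and production terms individually move in opposite directions.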


Fig. 4 shows how different protein noise components change as a function of the mean protein level as the gene's transcription rate $k_x$ is modulated. The extrinsic noise is primarily determined by the distribution of the cell-cycle time and is completely independent of the mean. In contrast, both $CV_P^2$ and $CV_R^2$ scale inversely with the mean, albeit with different scaling factors (Fig. 4). This observation is particularly important since many single-cell studies in E. coli, yeast and mammalian cells have found the protein noise levels to scale inversely with the mean across different genes [67-70]. Based on this scaling it is often assumed that the observed cell-to-cell variability in protein copy numbers is a result of stochastic expression. However, as our results show, noise generated through partitioning errors is also consistent with these experimental observations, and it may be impossible to distinguish between these two noise mechanisms based on protein $CV^2$ versus mean plots unless $\alpha$ is known.


5 Quantifying the effects of gene-duplication on protein noise 

The full model introduced in Fig. 2 assumes that the transcription rate (i.e., the protein burst arrival rate) is constant throughout the cell-cycle. This model is now extended to incorporate gene duplication during the cell-cycle, which is assumed to create a two-fold change in the burst arrival rate (Fig. 5). As a result of this, accumulation of proteins will be bilinear as illustrated in Fig. 1. We divide the cell-cycle time $T$ into two intervals: time from the start of the cell-cycle to gene-duplication ($T_1$), and time from gene-duplication to cell-division ($T_2$). $T_1$ and $T_2$ are independent random variables that follow arbitrary distributions modeled through phase-type processes (see Fig. S2 in the Supplementary Information). The mean cell-cycle duration and its noise can be expressed as

$$\langle T \rangle = \langle T_1 \rangle + \langle T_2 \rangle, \qquad \beta = \frac{\langle T_1 \rangle}{\langle T \rangle} \implies CV_T^2 = \beta^2\, CV_{T_1}^2 + (1 - \beta)^2\, CV_{T_2}^2, \tag{31}$$

where $CV_X^2$ denotes the coefficient of variation squared of the random variable $X$. An important variable in this formulation is $\beta$, which represents the average time of gene-duplication normalized by the mean cell-cycle time. Thus, $\beta$ values close to 0 (1) imply that the gene is duplicated early (late) in the cell-cycle process. Moreover, the noise in the gene-duplication time is controlled via $CV_{T_1}^2$.


Figure 5. Model illustrating stochastic expression together with random gene-duplication and cell-division events. At the start of the cell-cycle, protein production occurs in stochastic bursts with rate $k_x$. Genome duplication occurs at a random point $T_1$ within the cell-cycle and increases the burst arrival rate to $2k_x$. Cell-division occurs after time $T_2$ from genome duplication, at which point the burst arrival rate reverts back to $k_x$ and proteins are randomly partitioned between cells based on (5).


We refer the reader to Appendix E for a detailed analysis of the model in Fig. 5 and only present the main results on the protein mean and noise levels. The steady-state mean protein count is given by

$$\overline{\langle x \rangle} = \frac{k_x \langle B \rangle \langle T_1 \rangle \left(4 - \beta + \beta\, CV_{T_1}^2\right)}{2} + k_x \langle B \rangle \langle T_2 \rangle \left(3 - \beta + (1 - \beta)\, CV_{T_2}^2\right) \tag{32}$$

and decreases with $\beta$, i.e., a gene that duplicates early has, on average, a higher number of proteins. When $\beta = 1$, the transcription rate is $k_x$ throughout the cell-cycle and we recover the mean protein level obtained in (14). Similarly, when $\beta = 0$ the transcription rate is $2k_x$ and we obtain twice the amount as in (14). As per our earlier observation, more randomness in the timing of genome duplication and cell-division (i.e., higher $CV_{T_1}^2$ and $CV_{T_2}^2$ values) increases $\overline{\langle x \rangle}$.
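The two limits of (32) can be verified with a few lines of code. The sketch below parameterizes $\langle T_1 \rangle = \beta \langle T \rangle$ and $\langle T_2 \rangle = (1 - \beta) \langle T \rangle$, following the definition of $\beta$ in (31); function names and numerical values are illustrative only.

```python
def mean_with_duplication(kx, mean_B, mean_T, beta, cv2_T1, cv2_T2):
    """Eq. (32) with <T1> = beta <T> and <T2> = (1 - beta) <T>."""
    mean_T1 = beta * mean_T
    mean_T2 = (1 - beta) * mean_T
    first = kx * mean_B * mean_T1 * (4 - beta + beta * cv2_T1) / 2
    second = kx * mean_B * mean_T2 * (3 - beta + (1 - beta) * cv2_T2)
    return first + second

def mean_without_duplication(kx, mean_B, mean_T, cv2_T):
    """Eq. (14): constant transcription rate kx throughout the cell-cycle."""
    return kx * mean_B * mean_T * (3 + cv2_T) / 2

base = mean_without_duplication(kx=10.0, mean_B=1.5, mean_T=1.0, cv2_T=0.2)
late = mean_with_duplication(10.0, 1.5, 1.0, beta=1.0, cv2_T1=0.2, cv2_T2=0.0)
early = mean_with_duplication(10.0, 1.5, 1.0, beta=0.0, cv2_T1=0.0, cv2_T2=0.2)
print(base, late, early)
```

Duplication at the very end of the cycle (`beta = 1`) reproduces (14), and duplication at the very start (`beta = 0`) doubles it, matching the limits stated in the text.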

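Our analysis shows that the total protein noise level can be decomposed into three components

$$CV^2 = CV_E^2 + CV_R^2 + CV_P^2, \tag{33}$$

where $CV_E^2$ is the extrinsic noise from random genome-duplication and cell-division events. Given its complexity, we refer the reader to equation (100) in Appendix E2 for an exact formula for $CV_E^2$. Moreover, the intrinsic noise, which represents the sum of contributions from partitioning errors ($CV_R^2$) and stochastic expression ($CV_P^2$), is obtained as

$$CV_R^2 + CV_P^2 = \underbrace{\frac{4 \alpha (2 - \beta)}{3 \left( (\beta^2 - 4\beta + 6) + \beta^2 CV_{T_1}^2 + 2 (1 - \beta)^2 CV_{T_2}^2 \right)}\, \frac{1}{\overline{\langle x \rangle}}}_{CV_R^2} + \underbrace{\frac{(10 - 8\beta + 3\beta^2) + 3 \beta^2 CV_{T_1}^2 + 6 (1 - \beta)^2 CV_{T_2}^2}{3 \left( (\beta^2 - 4\beta + 6) + \beta^2 CV_{T_1}^2 + 2 (1 - \beta)^2 CV_{T_2}^2 \right)}\, \frac{\langle B^2 \rangle}{\langle B \rangle}\, \frac{1}{\overline{\langle x \rangle}}}_{CV_P^2}. \tag{34}$$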

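Note that for $\beta = 0$ and 1, we recover the intrinsic noise level in (28) from (34). Interestingly, for $B = 1$ with probability 1 and $\alpha = 1$, the intrinsic noise is always 1/Mean irrespective of the values chosen for $CV_{T_1}^2$, $CV_{T_2}^2$ and $\beta$. For high precision in the timing of cell-cycle events ($CV_{T_1}^2 \to 0$, $CV_{T_2}^2 \to 0$)

$$CV^2 = \underbrace{\frac{4 - 3 (\beta - 2)^2 \beta^2}{3 \left(\beta^2 - 4\beta + 6\right)^2}}_{\text{Extrinsic noise } (CV_E^2)} + \underbrace{\underbrace{\frac{4 \alpha (2 - \beta)}{3 \left(\beta^2 - 4\beta + 6\right)}\, \frac{1}{\overline{\langle x \rangle}}}_{CV_R^2} + \underbrace{\frac{10 - 8\beta + 3\beta^2}{3 \left(\beta^2 - 4\beta + 6\right)}\, \frac{\langle B^2 \rangle}{\langle B \rangle}\, \frac{1}{\overline{\langle x \rangle}}}_{CV_P^2}}_{\text{Intrinsic noise}}, \tag{35}$$

where the mean protein level is given by

$$\overline{\langle x \rangle} = \frac{k_x \langle B \rangle \langle T_1 \rangle (4 - \beta)}{2} + k_x \langle B \rangle \langle T_2 \rangle (3 - \beta). \tag{36}$$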

We investigate how different noise components in (35) vary with $\beta$ as the mean protein level is held fixed by changing $k_x$. Fig. 6 shows that $CV_P^2$ follows a U-shaped profile with the optimum occurring at $\beta = 2 - \sqrt{2} \approx 0.6$ and the corresponding minimum value being $\approx 5\%$ lower than its value at $\beta = 0$. An implication of this result is that if stochastic expression is the dominant noise source, then gene-duplication can result in slightly lower protein noise levels. In contrast to $CV_P^2$, $CV_R^2$ has a maximum at $\beta = 2 - \sqrt{2}$ which is $\approx 6\%$ higher than its value at $\beta = 0$ (Fig. 6). Analysis in Appendix E5 reveals that $CV_R^2$ and $CV_P^2$ follow the same qualitative shapes as in Fig. 6 for non-zero $CV_{T_1}^2$ and $CV_{T_2}^2$. Interestingly, when $CV_{T_1}^2 = CV_{T_2}^2$, the maximum and minimum values of $CV_R^2$ and $CV_P^2$ always occur at $\beta = 2 - \sqrt{2}$, albeit with different optimal values than in Fig. 6 (see Fig. S3 in the Supplementary Information). For example, if $CV_{T_1}^2 = CV_{T_2}^2 = 1$ (i.e., exponentially distributed $T_1$ and $T_2$), then the maximum value of $CV_R^2$ is 20% higher and the minimum value of $CV_P^2$ is 10% lower than their respective values for $\beta = 0$. Given that the effect of changing $\beta$ on $CV_R^2$ and $CV_P^2$ is small and antagonistic, the overall effect of genome duplication on intrinsic noise may be minimal and hard to detect experimentally.
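The location and size of these optima can be reproduced by scanning $\beta$ over a grid. The sketch below uses the $\beta$-dependent prefactors of $CV_R^2$ and $CV_P^2$ in (35), i.e., the case of precisely timed cell-cycle events with the mean protein level held fixed. Function names and the grid resolution are illustrative choices.

```python
import math

def partitioning_factor(beta):
    """Coefficient multiplying alpha / <x> in the CV_R^2 term of eq. (35)."""
    return 4 * (2 - beta) / (3 * (beta**2 - 4 * beta + 6))

def production_factor(beta):
    """Coefficient multiplying <B^2> / (<B> <x>) in the CV_P^2 term of eq. (35)."""
    return (10 - 8 * beta + 3 * beta**2) / (3 * (beta**2 - 4 * beta + 6))

grid = [i / 1000 for i in range(1001)]
beta_min = min(grid, key=production_factor)    # minimizes production noise
beta_max = max(grid, key=partitioning_factor)  # maximizes partitioning noise

print(beta_min, beta_max, 2 - math.sqrt(2))
print(1 - production_factor(beta_min) / production_factor(0.0))      # relative drop
print(partitioning_factor(beta_max) / partitioning_factor(0.0) - 1)  # relative rise
```

Both extrema fall at the grid point nearest $2 - \sqrt{2}$, with a relative drop in the production term of roughly 5% and a relative rise in the partitioning term of roughly 6%, in line with the values quoted above.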


6 Discussion 


We have investigated a model of protein expression in bursts coupled to discrete gene-duplication and cell-division events. The novelty of our modeling framework lies in describing the size of protein bursts, $T_1$ (time between cell birth and gene duplication), $T_2$ (time between gene duplication and cell division) and partitioning of molecules during cell division through arbitrary distributions. Exact formulas connecting the protein mean and noise levels to these underlying distributions were derived. Furthermore, the protein noise level, as measured by the coefficient of variation squared, was decomposed into three components representing contributions from gene-duplication/cell-division events, stochastic expression and random partitioning. While the first component is independent of the mean protein level, the other two components are inversely proportional to it. Key insights obtained are as follows:


• The mean protein level is affected by both the first and second-order moments of $T_1$ and $T_2$. In particular, randomness in these times (for a fixed mean) increases the average protein count.

• Random gene-duplication/cell-division events create an extrinsic noise term which is completely determined by moments of $T_1$ and $T_2$ up to order three.

• The noise contribution from partitioning errors decreases with increasing randomness in $T_1$ and $T_2$. Thus, if $\langle x \rangle$ is sufficiently small and $\alpha$ is large compared to $\langle B \rangle$ in (34), increasing noise in the timing of cell-cycle events decreases the total noise level.


• Genome duplication has counterintuitive effects on the protein noise level (Fig. 6). For example, if stochastic expression is the dominant source of noise, then doubling of transcription due to duplication results in lower noise as compared to constant transcription throughout the cell-cycle.
Figure 6. Contributions from different noise sources as a function of the timing of genome duplication for $CV^2_{T_1} = CV^2_{T_2} = 0$. Different noise components in (35) are plotted as a function of $\beta$, which represents the fraction of time within the cell-cycle at which gene-duplication occurs. The mean protein level is held constant by simultaneously changing the transcription rate $k_x$. Noise levels are normalized by their respective value at $\beta = 0$. The noise contribution from partitioning errors is maximized at $\beta \approx 0.6$. In contrast, the contribution from stochastic expression is minimum at $\beta \approx 0.6$. The extrinsic noise contribution from random gene-duplication and cell-division events is maximum at $\beta \approx 0.2$ and minimum at $\beta \approx 0.8$.


• For a non-bursty protein production process ($B = 1$) and binomial partitioning ($\alpha = 1$), the net noise from stochastic expression and partitioning is always $1/\langle x \rangle$, the noise level predicted by a Poisson distribution.
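As a rough numerical check of the first insight in the list above (randomness in cell-cycle times raises the mean protein level), the following sketch follows a single lineage with deterministic production at a constant rate and exact halving at division. The function name, the unit production rate, the number of cycles and the choice of an exponential cell-cycle time are all illustrative assumptions.

```python
import random

def time_averaged_mean(sample_T, rate=1.0, n_cycles=200_000, seed=0):
    """Time-averaged protein level along one lineage.

    Production is deterministic at `rate`; at division the level is halved.
    `sample_T(rng)` returns one cell-cycle duration.
    """
    rng = random.Random(seed)
    x = 0.0           # level at the start of the current cycle
    area = 0.0        # integral of x(t) dt accumulated so far
    total_time = 0.0
    for _ in range(n_cycles):
        T = sample_T(rng)
        area += x * T + 0.5 * rate * T * T  # integral of (x + rate * t) over [0, T]
        total_time += T
        x = 0.5 * (x + rate * T)            # equal partitioning at division
    return area / total_time

# Same mean cell-cycle time (1.0), different variability.
mean_deterministic = time_averaged_mean(lambda rng: 1.0)
mean_exponential = time_averaged_mean(lambda rng: rng.expovariate(1.0))

print(mean_deterministic, mean_exponential)
```

Under these assumptions the exponential case should give the larger mean, in line with a mean level that grows with $CV_T^2$ at fixed $\langle T \rangle$.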

We discuss our results on gene duplication in further detail and how noise formulas derived here can be 
used for estimating model parameters from single-cell expression data. 

6.1 Effect of gene duplication on noise level

In this first-of-its-kind study, we have investigated how discrete two-fold changes in the transcription rate due to gene duplication affect the intercellular variability in protein levels. Not surprisingly, the timing of genome duplication has a strong effect on the mean protein level: $\langle x \rangle$ changes by two-fold depending on whether the gene duplicates early ($\beta = 0$) or late ($\beta = 1$) in the cell-cycle. In contrast, the effect of $\beta$ on noise is quite small. As $\beta$ is varied keeping $\langle x \rangle$ fixed, noise components deviate by $\approx 10\%$ from their values at $\beta = 0$ (Fig. 6). Recall that these results are for a stable protein, whose intracellular copy numbers accumulate in a bilinear fashion. A natural question to ask is how would these results change for an unstable protein?

Consider an unstable protein with half-life considerably shorter than the cell-cycle duration. This rapid turnover ensures that the protein level equilibrates instantaneously after cell-division and gene-duplication events. Let $\gamma_x$ denote the protein decay rate. Then, the mean protein level before and after genome duplication is $\langle x \rangle = k_x \langle B \rangle / \gamma_x$ and $\langle x \rangle = 2 k_x \langle B \rangle / \gamma_x$, respectively. Note that in the limit of large $\gamma_x$ there is no noise contribution from partitioning errors since errors incurred at the time of cell division would be instantaneously corrected. The extrinsic noise, which can be interpreted as the protein noise level for deterministic protein production and decay, is obtained as (see Appendix F)

$$
CV_E^2 = \frac{(1 - \beta)\beta}{(2 - \beta)^2}. \tag{37}
$$


When $\beta = 0$ or 1, the transcription rate and the protein level are constant within the cell cycle and $CV_E^2 = 0$. Moreover, $CV_E^2$ is maximized at $\beta = 2/3$ with a value of 1/8. Thus, in contrast to a stable protein, extrinsic noise in an unstable protein is strongly dependent on the timing of gene duplication. Next, consider the intrinsic noise component. Analysis in Appendix F shows that the noise contribution from random protein production and decay is


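$$
CV_P^2 = \frac{1}{2}\left(\frac{\langle B^2 \rangle}{\langle B \rangle} + 1\right)\frac{1}{\langle x \rangle}, \qquad \langle x \rangle = \frac{k_x \langle B \rangle (2 - \beta)}{\gamma_x}. \tag{38}
$$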


While the mean protein level is strongly dependent on $\beta$, the intrinsic noise Fano factor $= CV_P^2 \times \langle x \rangle$ is independent of it. Thus, similar to what was observed for a stable protein, the intrinsic noise in an unstable protein is independent of $\beta$ for a fixed $\langle x \rangle$. Overall, these results suggest that studies quantifying intrinsic noise in gene expression models, or using intrinsic noise to estimate model parameters (see below), can ignore the effects of gene duplication. Finally, note that the mean and noise levels obtained for an unstable protein are independent of the cell-cycle time $T$.
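The unstable-protein formulas (37) and (38) are simple enough to evaluate directly. The sketch below does so on a grid of $\beta$ values; the function names and the grid are illustrative assumptions.

```python
def extrinsic_cv2_unstable(beta):
    """Extrinsic noise (37) for a rapidly equilibrating protein."""
    return (1 - beta) * beta / (2 - beta) ** 2

def mean_unstable(k_x, b_mean, gamma_x, beta):
    """Mean protein level in (38)."""
    return k_x * b_mean * (2 - beta) / gamma_x

def intrinsic_fano_unstable(b_mean, b_second_moment):
    """Intrinsic Fano factor CV_P^2 * <x> from (38); it has no beta dependence."""
    return 0.5 * (b_second_moment / b_mean + 1)

grid = [i / 3000 for i in range(3001)]  # beta in [0, 1], assumed resolution
beta_at_max = max(grid, key=extrinsic_cv2_unstable)
max_extrinsic = extrinsic_cv2_unstable(beta_at_max)

print(beta_at_max, max_extrinsic)
print(extrinsic_cv2_unstable(0.0), extrinsic_cv2_unstable(1.0))
```

The extrinsic term vanishes at both ends of the interval, while the Fano factor is a function of the burst-size moments only.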


6.2 Parameter inference from single-cell data 


Simple models of bursty expression and decay predict the distribution of protein levels to be negative binomial (or gamma distributed in the continuous framework) [71, 72]. These distributions are characterized by two parameters, the burst arrival rate $k_x$ and the average burst size $\langle B \rangle$, which can be estimated from measured protein mean and noise levels. This method has been used for estimating $k_x$ and $\langle B \rangle$
across different genes in E. coli [^[^. Our detailed model that takes into account partitioning errors 
predicts (ignoring gene duplication effects) 


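$$
\text{Intrinsic noise} = \frac{4\alpha}{3(3 + CV_T^2)}\frac{1}{\langle x \rangle} + \frac{3CV_T^2 + 5}{3(3 + CV_T^2)}\frac{\langle B^2 \rangle}{\langle B \rangle \langle x \rangle}. \tag{39}
$$

Using $CV_T^2 \ll 1$ and a geometrically distributed $B$ [50, 74, 76], (39) reduces to

$$
\text{Intrinsic noise} = \frac{4\alpha}{9}\frac{1}{\langle x \rangle} + \frac{5}{9}\frac{1 + 2\langle B \rangle}{\langle x \rangle}. \tag{40}
$$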
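The reduction from (39) to (40) rests on the second moment of a geometric burst size. A minimal sketch of that step follows, assuming the geometric distribution is supported on $\{0, 1, 2, \ldots\}$ (so that its variance is $\langle B \rangle^2 + \langle B \rangle$); the function names are illustrative.

```python
def intrinsic_noise_39(mean_x, b_mean, b_second_moment, alpha=1.0, cv2_T=0.0):
    """Intrinsic noise (39) for general burst moments and cell-cycle noise CV_T^2."""
    partitioning = 4 * alpha / (3 * (3 + cv2_T)) / mean_x
    production = (3 * cv2_T + 5) / (3 * (3 + cv2_T)) * b_second_moment / (b_mean * mean_x)
    return partitioning + production

def geometric_second_moment(b_mean):
    """<B^2> for a geometric burst size on {0, 1, 2, ...} (assumed support)."""
    return 2 * b_mean**2 + b_mean

def intrinsic_noise_40(mean_x, b_mean, alpha=1.0):
    """Intrinsic noise (40): geometric bursts and CV_T^2 -> 0."""
    return (4 * alpha + 5 * (1 + 2 * b_mean)) / (9 * mean_x)

# The two expressions should agree when CV_T^2 = 0 and bursts are geometric.
lhs = intrinsic_noise_39(100.0, 3.0, geometric_second_moment(3.0), alpha=2.0, cv2_T=0.0)
rhs = intrinsic_noise_40(100.0, 3.0, alpha=2.0)
print(lhs, rhs)
```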


Given measurements of intrinsic noise and the mean protein level, $\langle B \rangle$ can be estimated from (40) assuming $\alpha = 1$ (i.e., binomial partitioning). Once $\langle B \rangle$ is known, $k_x$ is obtained from the mean protein level given by (14). Since for many genes $\langle B \rangle \approx 0.5 - 5$, the contribution of the first term in (40) is significant, and ignoring it could lead to overestimation of $\langle B \rangle$. Overestimation would be even more severe if $\alpha$ happens to be much higher than 1, as would be the case for proteins that form aggregates or multimers [33]. One approach to estimate both $\langle B \rangle$ and $\alpha$ is to measure intrinsic noise changes in response to perturbing $\langle B \rangle$ by, for example, changing the mRNA translation rate through mutations in the ribosomal-binding sites (RBS). Consider a hypothetical scenario where the Fano factor (intrinsic noise times the mean level) is 6. Suppose mutations in the RBS reduce $\langle x \rangle$ by 50%, implying a 50% reduction in $\langle B \rangle$. If the Fano factor changes from 6 to 4 due to this mutation, then (40) gives $4\alpha + 10\langle B \rangle = 49$ before and $4\alpha + 5\langle B \rangle = 31$ after the mutation, so that $\langle B \rangle = 3.6$ and $\alpha = 3.25$.
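The hypothetical RBS scenario amounts to solving two linear equations obtained from (40). A minimal sketch follows, assuming geometric bursts and that the perturbation scales $\langle B \rangle$ by a known factor while leaving $\alpha$ unchanged; the function name and its arguments are illustrative.

```python
def estimate_burst_and_alpha(fano_before, fano_after, fold_change=0.5):
    """Solve (40) for <B> and alpha from two Fano-factor measurements.

    From (40): 9 * Fano = 4 * alpha + 5 + 10 * <B>.
    After the perturbation <B> becomes fold_change * <B> (alpha assumed unchanged).
    """
    b_mean = 9 * (fano_before - fano_after) / (10 * (1 - fold_change))
    alpha = (9 * fano_before - 5 - 10 * b_mean) / 4
    return b_mean, alpha

b_mean, alpha = estimate_burst_and_alpha(6.0, 4.0, fold_change=0.5)
print(b_mean, alpha)
```

Because the system is linear, measurement errors in the two Fano factors propagate directly into $\langle B \rangle$ and $\alpha$, so in practice the estimates would need error bars.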

Our recent work has shown that higher-order statistics of protein levels (i.e., skewness and kurtosis) or transient changes in protein noise levels in response to blocking transcription provide additional information for discriminating between noise mechanisms [77, 78]. Up till now these studies have ignored noise sources in the cell-cycle process. It remains to be seen if such methods can be used for separating the noise contributions of partitioning errors and stochastic expression to reliably estimate $\langle B \rangle$ and $\alpha$.
6.3 Integrating cell size and promoter switching

An important limitation of our modeling approach is that it does not take into account the size of growing cells. Recent experimental studies have provided important insights into the regulatory mechanisms controlling cell size [79-81]. More specifically, studies in E. coli and yeast argue for an "adder" model, where cell-cycle timing is controlled so as to add a constant volume between cell birth and division [82-84]. Assuming exponential growth, this implies that the time taken to complete the cell-cycle is negatively correlated with cell size at birth. In addition, cell size also affects gene expression: in mammalian cells transcription rates increase linearly with the cell size [85]. Thus, as cells become bigger they also produce more mRNAs to ensure gene product concentrations remain more or less constant. An important direction of future work would be to explicitly include cell size with size-dependent expression and timing of cell division determined by the adder model. This formulation will, for the first time, allow simultaneous investigation of stochasticity in cell size, protein molecular count and concentration.
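The negative correlation between cell-cycle duration and birth size under the adder model can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: exponential growth at a fixed rate, symmetric division, and an added size drawn from a Gaussian with 10% variation. None of these choices, nor the function names, come from the paper.

```python
import math
import random

def adder_cycle_time(birth_size, added_size, growth_rate):
    """Time for an exponentially growing cell to add `added_size` to its size."""
    return math.log((birth_size + added_size) / birth_size) / growth_rate

def simulate_adder(n_cycles=5000, added_mean=1.0, added_cv=0.1, growth_rate=1.0, seed=1):
    """Follow one lineage; return birth sizes and cell-cycle durations."""
    rng = random.Random(seed)
    size = 1.0
    births, times = [], []
    for _ in range(n_cycles):
        # Added size: assumed Gaussian noise around a constant increment.
        added = max(1e-6, rng.gauss(added_mean, added_cv * added_mean))
        births.append(size)
        times.append(adder_cycle_time(size, added, growth_rate))
        size = 0.5 * (size + added)  # symmetric division
    return births, times

def pearson(u, v):
    """Sample Pearson correlation coefficient."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

births, times = simulate_adder()
correlation = pearson(births, times)
print(correlation)
```

Cells born larger than average need less time to add the same increment, which is what produces the negative correlation.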

Our study ignores genetic promoter switching between active and inactive states, which has been shown to be a major source of noise in the expression of genes across organisms [86-95]. Taking into account promoter switching is particularly important for genome duplication studies, where doubling the number of gene copies could lead to more efficient averaging of promoter fluctuations. Another direction of future work will be to incorporate this additional noise source into the modeling framework and investigate its contribution as a function of gene-duplication timing.


Appendix 


A Mean of protein in the presence of cell-cycle variations 


Based on the standard stochastic formulation of chemical kinetics [96, 98], the model introduced in Figure 2A coupled with the phase-type distribution introduced in Figure 3 contains the following stochastic events


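Event | Reset | Propensity
Protein production | $x(t) \to x(t) + u$ | $k_x p_u$
Phase-type evolution | $g_{ij}(t) \to g_{ij}(t) - 1$, $g_{i(j+1)}(t) \to g_{i(j+1)}(t) + 1$ | $k g_{ij}$, $i \in \{2, \ldots, n\}$, $j \in \{1, \ldots, i-1\}$
Cell-division | $x(t_s) \to x_+(t_s)$, $g_{jj}(t_s) \to 0$, $g_{i1}(t_s) \to g_{i1}(t_s) + 1$ | $k p_i \sum_{j=1}^{n} g_{jj}$, $i \in \{1, \ldots, n\}$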

Note that $x_+(t_s)$ is the protein level after division; the statistics of $x_+(t_s)$ are related to the protein level before division as shown in equation (5) of the main text. Whenever an event occurs, the protein level and the states of the phase-type distribution change based on the stoichiometries shown in the second column of the table. The third column of the table shows the event propensity function $f(x, g_{ij})$, which determines how often reactions occur, i.e., the probability that an event occurs in the next infinitesimal time interval $(t, t + dt]$ is $f(x, g_{ij})dt$. Protein production is a stochastic event which happens in bursts; each burst generates $B$ molecules, where $B$ is a general random variable with distribution

$$
\text{Probability}\{B = u\} = p_u, \quad u \in \{0, 1, \ldots, \infty\}. \tag{41}
$$

The probability of having a burst of size $u$ in the time interval $(t, t + dt]$ is $k_x p_u dt$. Events related to the time evolution of the phase-type distribution happen with a constant rate $k$. Cell-division changes both the level of protein and the states of the phase-type distribution. This event contains the start of a new cell-cycle; hence whenever this event occurs, the last state of the phase-type distribution resets to zero, and a new cell-cycle which is a sum of $i$ exponentials starts with probability $p_i$; the protein count level also resets to $x_+(t_s)$. The probability of cell-division and starting a new cell-cycle from state $g_{i1}$ in the time interval $(t, t + dt]$ is $k p_i \sum_{j=1}^{n} g_{jj}\, dt$.
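The events described above can be simulated directly. The following is a minimal sketch of such a simulation for the simplest special case: a phase-type distribution with a single exponential stage ($n = 1$), geometric bursts on $\{0, 1, \ldots\}$, and binomial partitioning. The parameter values and the function name are illustrative assumptions, and the code is not the authors' implementation.

```python
import math
import random

def simulate_bursty_protein(k_x=10.0, b_mean=2.0, mean_T=1.0, n_divisions=20000, seed=0):
    """Time-averaged protein count along one lineage (exact stochastic simulation)."""
    rng = random.Random(seed)
    q = b_mean / (1.0 + b_mean)      # geometric parameter: P(B >= u) = q**u
    division_rate = 1.0 / mean_T     # single exponential stage (assumed n = 1)
    total_rate = k_x + division_rate
    x = 0
    area = 0.0
    elapsed = 0.0
    divisions = 0
    while divisions < n_divisions:
        dt = rng.expovariate(total_rate)   # waiting time to the next event
        area += x * dt
        elapsed += dt
        if rng.random() < k_x / total_rate:
            # Burst event: add a geometric number of molecules (inverse transform).
            x += int(math.log(1.0 - rng.random()) / math.log(q))
        else:
            # Division event: each molecule is kept with probability 1/2.
            x = sum(1 for _ in range(x) if rng.random() < 0.5)
            divisions += 1
    return area / elapsed

mean_level = simulate_bursty_protein()
print(mean_level)
```

With these assumed parameters the time-averaged level can be compared against the analytical mean; for an exponential cell-cycle time ($CV_T^2 = 1$) the $\beta = 1$ limit of (90) suggests a value near $2 k_x \langle B \rangle \langle T \rangle$.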


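Theorem 1 of [55] gives the time derivative of the expected value of any function $\varphi(x, g_{ij})$ as

$$
\frac{d\langle \varphi(x, g_{ij}) \rangle}{dt} = \left\langle \sum_{\text{Events}} \Delta\varphi(x, g_{ij}) \times f(x, g_{ij}) \right\rangle, \tag{42}
$$

where $\Delta\varphi(x, g_{ij})$ is the change in $\varphi$ when an event occurs. Based on this setup, the mean dynamics of protein can be written by choosing $\varphi$ to be $x$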


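$$
\begin{aligned}
\frac{d\langle x \rangle}{dt} &= k_x \langle B \rangle + k \left\langle \sum_{j=1}^{n} (x_+ - x)\, g_{jj} \right\rangle, \\
\frac{d\langle x \rangle}{dt} &= k_x \langle B \rangle - \frac{k}{2} \left\langle \sum_{j=1}^{n} x g_{jj} \right\rangle,
\end{aligned} \tag{43}
$$

where we replaced the conditional expected value of $x_+$ by $x/2$ based on the relation between the statistical properties of $x_+$ and $x$ shown in equation (5).

The dynamics of $\langle x \rangle$ is not closed and depends on the moments $\langle x g_{jj} \rangle$; hence, in order to have a closed set of equations, we add new moment dynamics by selecting $\varphi$ to be $x g_{ij}$. We do it in two steps: first we write the moment dynamics of $\langle x g_{11} \rangle$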


djxgii) 

dt 


kx{B){gii) + -pi (xgh) - kpi (xgl^) - k'^p, {xgn). 


2=2 


(44) 


In the equation (9) of the main text it has been shown that 

(5r,x™) = (5*,X™), nG{l,2,...}, 


thus the term (^xgfi) will simplify as 

(x5u) = {xgii), 

and the dynamics of {xgu) can be written as 

= kx;{B){gii) + ^pi (xgii) - k {xgn) . 

In the second step we write dynamics of the moments of the form {xgij) other than {xgn) 


d{xgii) 

dt 

d{xgij) 

dt 


kx{B){gii) + kpi 


^(2 + 2^*1 “ xgii)gjjj - k{xgii), 


kx{B){gij) - k{xgij) + k{xgi(j_i)), j G {2,.. .,i}, 


(45) 

(46) 

(47) 

(48a) 

(48b) 
where dynamics of $\langle x g_{i1} \rangle$ can be written as


djxgn) 

dt 


kx {B)pi 


+ kp,iY^ ^g^j \ +kpi 


f9ii9jj / k{xgii). 


\i=i 


The equation (10) in the main text shows that 


(49) 


{9^]9rqx'^) = 0 , if i ^ r or jV 9 , 

hence \9i^93^ — and equation (491 simplifies to 


(50) 


= kx{B){g,i) + - k{xg^i). (51) 

Further based on Figure 3 in the main text the probability of selecting a branch of i exponentials is 
Pi, and because all the transitions happen with a constant rate fc, hence mean of each of these i states is 


Pi 


(5b) = 


(52) 


Thus equations (47), (48b), and (51) can be compactly written as shown in equation (11). 


B Moment dynamics of hybrid model introduced in Figure 2B 

Stochastic hybrid system introduced in Figure 2B coupled with phase-type distribution contains the 
following stochastic events 


Event 

Reset 

Propensity 

Phase-type 

Evolution 


kgij. 

Cell-division 

gjjits)^^’ 
gil(ts)^gilM + ^ 

n 

7=1 


and deterministic protein production dynamics

$$
\dot{x} = k_x \langle B \rangle. \tag{53}
$$

The time derivative of the expected value of any function $\varphi(x, g_{ij})$ for this hybrid system can be written as [55]

$$
\frac{d\langle \varphi(x, g_{ij}) \rangle}{dt} = \left\langle \sum_{\text{Events}} \Delta\varphi(x, g_{ij}) \times f(x, g_{ij}) \right\rangle + \left\langle \frac{\partial \varphi(x, g_{ij})}{\partial x} k_x \langle B \rangle \right\rangle, \tag{54}
$$

where the first term on the right-hand side is contributed from stochastic events and the second term is contributed from deterministic protein production dynamics. Based on this equation, the mean dynamics of the protein is calculated by choosing $\varphi$ to be $x$


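$$
\frac{d\langle x \rangle}{dt} = k_x \langle B \rangle - \frac{k}{2} \left\langle \sum_{j=1}^{n} x g_{jj} \right\rangle, \tag{55}
$$

which is the same as equation (43). In addition to the mean, the dynamics of $\langle x g_{ij} \rangle$ are also equal to their equations in the previous section.

The second order moment dynamics of protein can be expressed by choosing $\varphi$ to be $x^2$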


^(1) - x^'^ , (56) 

which can be simplified as 

= (57) 

In order to have a closed set of equations we select p to be of the form x^gij. At the first step we write 
moment dynamics of (x'^gu) 


= 2k^{B){xgii) + jpi (x^gh) - kpi (x'^gfi) - k^pi{x'^gn). 

i^2 

Based on equation (9) of the main text, the term {x'^gh') simplifies as 
hence dynamics of {x'^gn) will be 

= 2kx{B){xgii) -f {x^gii) - k{x^gii). 

In the second step, we write dynamics of moments (x'^gij) when gij ^ gii 
d{x^gri) 


dt 

d{x'^gzj) 


= 2kAB){xgii) -f kpi + ^gn - x'^gn^ gj^ - k{x^gii), 


= 2k^{B){xgij) - k{x g^ + k{x g(^-l)j), j = {2 ,... ,i} , 

where dynamics of {x'^gn) can be shown to follow 

dlx^gn) I r,\ I \ k /2 \ /■^ 2 \ 7 / 2 \ 

= 2kAB){xgii) + jPt\ 2 ^x gjj ) - —p, [ 2 _^x g^ig^j ) - k{x ga)- 


dt 


\j=i 


\j=i 


Based on equation (10) in the main text {^J2j=i x^gngjjj = 0, thus equation (62) simplifies to 
d{x g^i) ^ 2ka:{B){xg^i) + jp^ l^x^gjS - k{x^g,i)- 


dt 


\I=i 


(58) 

(59) 

(60) 

(61a) 

(61b) 

(62) 

( 63 ) 


Equations (60), (61b), and (63) can be compactly written as equation (16) in the main text. 
C Moment dynamics of hybrid model introduced in Figure 2C

Stochastic hybrid system introduced in Figure 2C coupled with phase-type distribution contains the 
following stochastic events 


Event 

Reset 

Propensity 

Phase-type 

Evolution 

^dy+i)h)^g,O+i)(0 + l 

kgij, 

ie{2,...,n}, 

yep,...,/-!} 

Cell-division 


n 

7=1 



/e {!,...,7z} 


and deterministic protein production dynamics

$$
\dot{x} = k_x \langle B \rangle. \tag{64}
$$

Note that in this model $x(t)$ is a continuous random variable, thus we also use a continuous distribution to describe $x_+(t_s)$; however, the statistical properties of $x_+(t_s)$ are still given by (5). For this model we can still use equation (54) to derive moment dynamics; the equations describing the time evolution of the mean and $\langle x g_{ij} \rangle$ are the same as in the previous models, thus the mean of protein for this model is equal to its value in Appendix A. The second order moment dynamics of protein can be written by choosing $\varphi$ to be $x^2$ in equation (54)


d{a 


dt 


= 2kAB){x) + k{^Y.[T + ^-' 


9jj 


(65) 


where the conditional expected value of $x_+^2$ is substituted based on equation (5). Dynamics of $\langle x^2 \rangle$ can be simplified as




3k 

T 


i: 


X gjj 


( 66 ) 


The same as before we add dynamics of the form {x'^gij) to have a closed set of dynamics. First we add 
dynamics of {x'^gu) 


d{x gii) ^ 2k:^{B){xgii) + (xgn) -f jpi {x^gh) - kpi (x'^gh) - k'^pi{x^gn), 


dt ■ 4 

Based on equation (9) of the main text dynamics of {x^gn) simplifies to 


z=2 


= 2k^{B){xgii) + {xgii) + ^pi {x^gii) - k{x^gii). 


Now we express dynamics of moments (x'^gij) for gij ^ gn 

d{x^gii) 

= ^Kx'\-D)'\Xgil) -I- KPi \ 

= 2kx{B)[xgij) k(^x gij) k(^x g^i—i'jj ), j = {2,..., ij-. 


dt 

d{x‘^gij) 


, , , , / (x^ x^ ax ax 2 \ \ i 1 2 \ 

= 2 kx{B}{xg,i} + kp,{^ 2 ^^ -h —gn + ^ + - x gn j gjj j - k{x gn), 


(67) 


( 68 ) 


(69a) 


dt 


(69b) 
where dynamics of $\langle x^2 g_{i1} \rangle$ can be shown as


dix^gn) , ak / ■^ \ k / ■^ 

=2k^{B){xgii) + —Pi (^xgjj ) + jPt {Y1 


dt 


X gjj 


\i=i 


\i=i 


ak 


3fc 


(70) 


+ ~TP^ \^^9i^93j ) - -yP^ {J2^^9ii93j ) - k{x^g^l). 


\i=i 


\i=i 


Based on equation (10) in the main text = 0; (jy^=i ^9ii9jj'^ = Oj hence equation 

(70) simplifies to 


d{x g^l) ^ 2k^{B){xg^i) + ^pi l^xgjj\ + jPt ('^x'^gjj \ - k{x‘^gii). 


dt 


(71) 


\i=i 


\i=i 


Equations (66), (68), (69b), and (71) can be compactly written as equation (23) in the main text.


D Second and third-order moment dynamics of the full model 

Based on the model introduced in Appendix A, the second order moment dynamics of protein is expressed by choosing $\varphi$ to be $x^2$ in equation (42),


d{x‘^) 

dt 


= k^{B^) + 2k^{B){x) + k T ~ 


(72) 


where conditional expected value of x\ is substituted based on equation (5). Dynamics of (x^) can be 
simplified as 


d{x^) 

dt 


ak 


3k 


= k^{B'^) + 2ka:{B) (a:) + ^ ( XI ^9n ) “ A” ( 51 ^^9j3 


(73) 


\i=i 




The same as before we add dynamics of the form (x'^gij) to have a closed set of moments. First we write 
dynamics of (x^gn) 

= k^{B'^)pi+2k^{B){xgii) + ^Pi{xg'^^) + jPi{x'^g'^^)-kpi{x'^gl^)-k'^Pi{x'^gii), (74) 


dt 


i=2 


Based on equation (9) of the main text dynamics of (x'^gii) simplifies to 

= k^{B'^)pi + 2ka;{B){xgii) + ^pi {xgn) + ^pi (x'^gii) - fc(x^gii). 
Next, dynamics of moments (x'^gij) when gij ^ gn can be written as 
d(x^5ii) k^{B'^)pi 


(75) 


dt 


■ 2kj:{B){xg^i) 


/ /x^ x^ ax ax 2 \ \ 1 1 2 \ 

+ kpi (^ 2 ^ + —gn + ^ + -^9ii -X giij g^j j - k{x ga), 


d{x^gij) k^{B'^)pi 


dt 


2kx{B) (^xgij) k{x gij)k(^x j — {2 ,...,?}. 


(76a) 

(76b) 
where dynamics of $\langle x^2 g_{i1} \rangle$ can be shown as


dt 


+ 2k^ (B) (xgn) + I ^ xgjj ) + jPi 


\j=i 


\i=i 


ak ^ \ 3k ^ 2 \ , / 2 X 

+ \ \ 2 : g.igjj ) - k{x g,i). 


(77) 


\i=i 


\i=i 


Based on equation (10) in the main text ^^guPjj'^ = 0 ^9ii9jj'^ = Oj hence equation 

simplifies to 


d{x‘^g,i) k^{B'^)pi 


ak 


dt 


+ 2k^{B){xg,i) + —p, ('^xgjj ) + -p, i^x^g^j ) - k{x^g,i). (78) 


\i=i 


\l=i 


Equations (73), (75), (76b), and (78) can be compactly written as equation (26) in the main text.


E Contribution of different sources of stochasticity in protein 
by taking into account gene-duplication 

We study the contribution of different sources of stochasticity by using the models introduced in Figure S1. The cell-cycle time consists of two time intervals: the time interval before gene-duplication and the time after gene-duplication. These time intervals are modeled by using two independent phase-type distributions as shown in Figure S2. Based on phase-type characteristics, the means of the states of the first phase-type ($s_{ij}$) and the second phase-type ($g_{ij}$) are

Pi 

{sij) = j/3, f e {l,...,ni}, jG{l,...,f}, 

{9tj) = ji^- 13), f G {l,...,n2}, jG 

where /? is defined as 

^ Mean time interval before gene-duplication 
Mean cell-cycle time 

We start our analysis by deriving mean level of protein in the next section. 

E.1 Mean of protein count level in the presence of gene-duplication

After gene-duplication the number of gene copies expressing a specific protein doubles. Thus the rate of protein production increases by a factor of two as shown in Figure S1A. This model coupled with phase-type distributions contains the following stochastic events

Note that in the protein production event, before gene-duplication all the states $g_{ij}$ are zero, thus the propensity function will be $k_x p_u$. After gene-duplication and before division, one of the states $g_{ij}$ is one, hence the propensity function will be $2 k_x p_u$. At the time of gene-duplication, the states of the first phase-type will reset to zero and state $g_{i1}$ of the second distribution will be selected with probability $p'_i$; hence the propensity function of the gene-duplication event is $k_1 p'_i \sum_{j=1}^{n_1} s_{jj}$. At the end of the cell-cycle, the states of the second phase-type will reset to zero and a new cell-cycle which is a sum of $i$ exponentials will be selected with probability $p_i$, thus the propensity function of the cell-division event is $k_2 p_i \sum_{j=1}^{n_2} g_{jj}$.




m) 

(T) ■ 


(79) 


(80) 
[Figure S1, panels A-C: schematics of the three hybrid models, each marking the gene-duplication and cell-division events.]

Figure S1: Stochastic hybrid models for quantifying different sources of noise. Gene-duplication and cell-division times are random events. A) Protein production happens in random bursts with burst frequency $k_x$. After the gene-duplication event the burst frequency doubles ($2k_x$). At the time of division proteins are distributed between mother and daughter cells randomly, and the protein burst frequency returns to $k_x$. B) Protein production is considered in a deterministic fashion, and after gene-duplication the protein production dynamics is multiplied by a factor of two, i.e., $\dot{x} = 2k_x \langle B \rangle$. In the division event proteins are distributed between mother and daughter cells equally. Thus the only stochastic events are duplication and division events. C) Protein is produced in a deterministic fashion, but at the time of division protein levels in daughter and mother cells are random. Thus duplication, division, and partitioning are random events.


Theorem 1 of [55] gives the time derivative of the expected value of any function $\varphi(x, s_{ij}, g_{ij})$ as

$$
\frac{d\langle \varphi(x, s_{ij}, g_{ij}) \rangle}{dt} = \left\langle \sum_{\text{Events}} \Delta\varphi(x, s_{ij}, g_{ij}) \times f(x, s_{ij}, g_{ij}) \right\rangle, \tag{81}
$$

where $\Delta\varphi(x, s_{ij}, g_{ij})$ is the change in $\varphi$ when an event occurs. The first-order moment dynamics of this model can be expressed by selecting $\varphi$ to be $x$ in equation (81)


d{x) 

dt 


—kx{B) j 1 -b / gij 

\i=i j=i I 




(82) 


\i=i 


where conditional expected value of x+ is replaced from equation (5); by using equation (79) mean 
Figure S2: Cell-cycle time consists of two time intervals: at the end of the first interval the gene duplicates, and at the end of the second one the cell divides. Two independent phase-type distributions are used to model cell-cycle time in the presence of genome duplication. The states of the first distribution are denoted by $s_{ij}$, $i = \{1, \ldots, n_1\}$, $j = \{1, \ldots, i\}$; transition between these states happens at a constant rate $k_1$. The states of the second distribution are denoted by $g_{ij}$, $i = \{1, \ldots, n_2\}$, $j = \{1, \ldots, i\}$, and transition between these states occurs at a rate $k_2$.


dynamics can be simplified as 




dt 


(83) 


\i=i 


Mean dynamics is not closed thus we add dynamics of (xSij), i = ,ni}, j = {1,... ,i} and 

(xgij), i = {1,..., ni}, j = {1,..., i} to have a closed set of moment equations. These moment dynamics 
are simplified by using equations (5), (9), (10) and (79l as 




dt i ' 2 

d{xsij) _ kx{B)p'^l3 


\j=i 


dt 


ki(xSij ) -f ki{xSi(^j_i'^')^ j {2,... , 7 }, 




djxgij) _ 2k^{B)pi{l - /3) 
dt i 


- k2{xg^j) + k2{xg,^j_i)), j = {2,...,i}. 


(84a) 

(84b) 

(84c) 

(84d) 


In order to find the mean of protein, first we need to find the moments (xsij), z = {1,..., ui} , j = 
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Event 


Reset 


Propensity 


f 


Protein production x{t) x{t) + U 


k,Pu 


»2 i 


1 +ZZ^i 


!=1 j=l 


First phase-type ^ 1 


evolution 




y 

(/) 1 -^ /+ i ) (0 +1 


k,Sy, 

7 g{1,. 


s j. (t^ I — ^ 0, 

Gene-duplication , / n ^ 




Second phase-type SijiO ^ 

evolution g,. _ (0 ^ g. (0 + 1 


KSij, 

7g{1,...,z-1} 


Cell-division 





l G 


{1,..., *} and (xgij), i = {1,..., 712 } , j = {Ij ■ • • j *}• For calculating these moments we should calculate 
the term this term can be obtained by analyzing equation (83) in steady-state 

2kpB){2-p) 


k^{B){2 -I3) = ) => {J2^9j3 ) = 


(85) 




\i=i 


By having this term, we calculate (xsij) by recursion process: we start by calculating (xsn) by substi¬ 
tuting equation (85) in equation (84aI. In the next step we use the definition we derived for (xsn) to 
calculate {xsi 2 ) from equation (84b). We continue this process until we derive all the moments 


{xSij) = -b (2 - /3) ) , i = {1,... ,ni}, j = {!,.. .,i}. 


( 86 ) 


Now we need to calculate the moments (xgij), i = {1,..., n 2 } , j = {1,..., i}, thus we need the expression 
of the term from equation (86) we have the following 

2kpB) 


E 

\i=i 


XSii ) = 


ki 


(87) 


Substituting this term in equations (84c) and (84d) result in 


(xgij) = 


2kpB) 


P* (1 - /3)- + 1 , i = {1, ■ ■ ■ ,'^ 2 }, j = {!,■ • ■ ,0- 


( 88 ) 
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Note that 


y~! + yz yz 3*^—i ^ (2^) — (a; | y^ y^] sij +^ ^ 5^^ 

i=i i=i j=i 


^i=l i=l i=l j=l 

ni i n2 i 


(89) 


^ (2^)=yi yi +yi yi 

2 = 1^=1 i—1j—1 

Thus by adding all the term calculated here and using equation (7) mean of protein can be calculated as 


_ k,{B){T,){4-/3 + /3CVSJ 

\^) ~ o 


+ k,{B){T2) (3 - /3 + (1 - /3)CF|; 


(90) 
E.2 Noise in protein count level contributed from cell-cycle time

In order to calculate the noise contributed from cell-cycle time variation, the model introduced in Figure S1B coupled with phase-type distributions is used. This model contains the following stochastic events


Event 

Reset 

Propensity 

First phase-type 
evolution 

‘^/(y+l) (0 ‘^/(y+l) (0 + 1 


Gene-duplication 


iG ,n^} 

Second phase-type 
evolution 

^/o+i)(0^g/o+i)(0 + i 

Kgij^ 

7 g{1,. 

Cell-division 

x{t^)h^x{t^)l2, 

gjj{t,)^0. 

/G 


and deterministic protein production 


( "2 i \ 

l + ■ (91) 

i=i i=i J 

Theorem 1 of 155] gives the time derivative of the expected value of any function (p{x, Sij^gij) as 




dt 


= ( X! X /(x, Sij,g,j) 

\ Events 


d(p{x,g^j) 

dx 


(92) 


kx{B) I 1 + XI XI 5b 
i=l 3 = 1 


where the first term on the right-hand side is contributed from stochastic events, and the second term is contributed from deterministic protein production. In this model, the dynamics of $\langle x \rangle$, $\langle x s_{ij} \rangle$ and $\langle x g_{ij} \rangle$ are the same as equations (83) and (E.6), thus the mean of protein, $\langle x s_{ij} \rangle$, and $\langle x g_{ij} \rangle$ will be equal to their values in the previous section. Further, the second-order moment dynamics of protein can be added by selecting $\varphi$ to be $x^2$ in equation (92)


^ = 2k^{B) [ (x) -f l^'^xg^j 


dt 


\i=i j=i 



(93) 


This equation is not closed thus we add dynamics of (x'^Sij), i = {!,...,ni}, j = {!,...,i} and 
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{x^gij), i = {1,..., 77 , 2 } , j = {1, to have a closed set of equations 

= ‘2kx{B){xsa) + ^Pi “ ki{x‘^Sii) 


djx^Sij) 

dt 


— ‘^kx {B} {xsij ) ki (x Sij 'j -\- ki(^x s^2—i)j}^ j — 


= ika;{B){xgii) + kip^ “ k2{x^gii), 


djx^grj) 

dt 


= Aka;{B){xgij) - k2{x'^g^j) + fc2(a;^ff(*-i)j), j = {2,... ,i} . 


(94a) 

(94b) 

(94c) 

(94d) 


In order to calculate noise we need to express (x'^Sij), and (x'^gij), which requires calculating the term 
term can be derived by analyzing equation (93) in steady-state 


Skn 


x^gjj \ = 2k^{B) (a;) + ( ^ XI ^9ij 


\i=i 


u=ii=i 


4feg(i3)^(Ti)((4-/3) + /?CyjJ ^ 16*2(5)2(T 2 ) ((3 - /3) + (1 - /3)CF|J 


(95) 


\i=i 


3*2 


3*2 


where in deriving this term we used equation (90) and we summed all the terms in equation (88). By 
having this term, we calculate (x'^Sij) by recursion process, we derive (x'^sn) by substituting equation 
(95) in equation (94a). In the next step we use the definition of {x'^sn) to calculate (x^s^) from equation 
(94b). We continue this process until we derive all the moments 


kl{Br{T,) ((4 - /3) + pcvl) , 4*2(5)2(52) ((3 -/?) + (!- P)CVl) 

[X Sij) — Pi ~r Pi 


3*1 

2*2(5)2 , + 

*1 


3*1 

, i = {l,...,ni}, j = {l,...,i}. 


(96) 


Expressing (x'^gij) requires calculation of the term (X^Jli which can be obtained from equation 

([^ as 


E 

\i=i 


X-Sjj > = 


4*2(5)2(51) ((4-/3) + /3CE^J 4*2(5)2(52) ((3 - ^) + (1 -/3)CE^J 


3*1 


3*1 


(97) 


Thus (x'^gij) can be obtained with a recursion process from equations ( |94c[ ) and ( 94d| ) 

4*2(5)2((5i)((4-/3)+/ICE^J^ ^ 4*2(5)2(52) ((3-/3) +(1-/3)CI/^J^ 

dij) — o 7„ Pi ' o Pi 


3*2 

8*2 (5)2 f(l-i3)f+j 
-i- P^ 


3*2 


, i = {!,..., 772}, j = { 1 ,...+}. 


(98) 


Note that Sj=i Y^j=i ^4ms the second order moment of protein can 

be derived by adding all the terms in equations (96) and (98). (x^) can be simplified by using equations 
(7) and (18b) in the main article as 

— , 2,.^2 4(^^f) + 16(T|) + 2(5)3(3(2 - fdf + (3^5- 3p)CVl + 8(1 - PfCVD 

(x2) = *^(5) — —”T(5)--- ■ 


( 99 ) 
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cvl = 


Finally, using the definition of CV^ results in noise of protein raised from cell-cycle time variations 

(4(r3) 16(T|))/(r)3 - 3(2 - 4/3 -t 

3 ((/32 - 4/3 -h 6) -e + 2(1 - 

-3/32(/32(-2 + -t (1 - pfCV^^)(2 - 12/3 -I- 3/32 -e ^(13'^CV^^ -t (1 - pfCV^^)) 

3 ((/32 - 4/3-e 6)-e/32CF2^ -e 2(1 -/3)2CK2J " 


( 100 ) 
E.3 Noise in protein count level contributed from random partitioning

In order to take into account noise caused by random partitioning of proteins between two daughter cells, we use the model shown in Figure S1C coupled with phase-type distributions. This model contains the following stochastic events


Event 

Reset 

Propensity 

First phase-type 
evolution 

‘^/(y+l) (0 ‘^/O+D (0 + 1 

Ksij, 

7 g{1,. 

Gene-duplication 



Second phase-type 
evolution 

^/o+i)(0^.?/o+i)(0 + i 

Kgij^ 

7 g{1,. 

Cell-division 


ZG 


and deterministic protein production

$$
\dot{x} = k_x\langle B\rangle\left(1+\sum_{i=1}^{n_2}\sum_{j=1}^{i}g_{ij}\right).
\tag{101}
$$

Note that here $x$ is a continuous random variable, hence $x_+$ is also obtained from a continuous distribution. The connection between the statistical moments of $x$ and $x_+$ is given by (5).

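For this model, $\langle x\rangle$, $\langle x s_{ij}\rangle$, and $\langle x g_{ij}\rangle$ are equal to their values in Section E.1 and Section E.2. However, the dynamics of $\langle x^2\rangle$ and $\langle x^2 s_{i1}\rangle$ are different:

$$
\frac{d\langle x^2\rangle}{dt} = 2k_x\langle B\rangle\left(\langle x\rangle+\sum_{i=1}^{n_2}\sum_{j=1}^{i}\langle x g_{ij}\rangle\right)+\frac{1}{4}\alpha k_2\sum_{j=1}^{n_2}\langle x g_{jj}\rangle-\frac{3}{4}k_2\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle,
\tag{102a}
$$

$$
\frac{d\langle x^2 s_{i1}\rangle}{dt} = 2k_x\langle B\rangle\langle x s_{i1}\rangle+\frac{k_2}{4}p_i\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle+\frac{1}{4}\alpha k_2 p_i\sum_{j=1}^{n_2}\langle x g_{jj}\rangle-k_1\langle x^2 s_{i1}\rangle,
\tag{102b}
$$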

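Note that the dynamics of $\langle x^2 s_{ij}\rangle$, $\langle x^2 g_{i1}\rangle$, and $\langle x^2 g_{ij}\rangle$ are identical to equations (94b), (94c), and (94d). Similar to the previous section, we start by deriving the term $\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle$. Analyzing equation (102a) in steady-state gives this term as

$$
\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle = \frac{4k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_2}+\frac{16k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_2}+\frac{2\alpha k_x\langle B\rangle(2-\beta)}{3k_2}.
\tag{103}
$$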























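Substituting equation (103) in equations (102b) and (94b) results in

$$
\begin{aligned}
\langle x^2 s_{ij}\rangle ={}& \frac{k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_1}\,p_i + \frac{4k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_1}\,p_i \\
&+ \frac{2k_x^2\langle B\rangle^2}{k_1}\left(\frac{\beta(j^2+j)}{2k_1^2\langle T_1\rangle}+\frac{(2-\beta)j}{k_1}\right)p_i + \frac{2\alpha k_x(2-\beta)\langle B\rangle}{3k_1}\,p_i, \qquad i=\{1,\dots,n_1\},\ j=\{1,\dots,i\}.
\end{aligned}
\tag{104}
$$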


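In the next step we derive the moments $\langle x^2 g_{ij}\rangle$; we start by calculating $\sum_{i=1}^{n_1}\langle x^2 s_{ii}\rangle$ from (104):

$$
\sum_{i=1}^{n_1}\langle x^2 s_{ii}\rangle = \frac{4k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_1}+\frac{4k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_1}+\frac{2\alpha k_x\langle B\rangle(2-\beta)}{3k_1}.
\tag{105}
$$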

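With this term, the moments $\langle x^2 g_{ij}\rangle$ are derived by solving equations (94c) and (94d) in steady-state:

$$
\begin{aligned}
\langle x^2 g_{ij}\rangle ={}& \frac{4k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_2}\,p'_i + \frac{4k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_2}\,p'_i \\
&+ \frac{8k_x^2\langle B\rangle^2}{k_2}\left(\frac{(1-\beta)(j^2+j)}{2k_2^2\langle T_2\rangle}+\frac{j}{k_2}\right)p'_i + \frac{2\alpha k_x\langle B\rangle(2-\beta)}{3k_2}\,p'_i, \qquad i=\{1,\dots,n_2\},\ j=\{1,\dots,i\}.
\end{aligned}
\tag{106}
$$

Note that

$$
\langle x^2\rangle = \left\langle x^2\left(\sum_{i=1}^{n_1}\sum_{j=1}^{i}s_{ij}+\sum_{i=1}^{n_2}\sum_{j=1}^{i}g_{ij}\right)\right\rangle = \sum_{i=1}^{n_1}\sum_{j=1}^{i}\langle x^2 s_{ij}\rangle+\sum_{i=1}^{n_2}\sum_{j=1}^{i}\langle x^2 g_{ij}\rangle,
\tag{107}
$$

hence the second-order moment is

$$
\langle x^2\rangle = k_x^2\langle B\rangle^2\,\frac{\langle T_1^3\rangle+4\langle T_2^3\rangle+2\langle T\rangle^3\left(3(2-\beta)^2+\beta^2(5-3\beta)CV^2_{T_1}+8(1-\beta)^2CV^2_{T_2}\right)}{3\langle T\rangle}+\frac{2\alpha k_x\langle B\rangle(2-\beta)\langle T\rangle}{3}.
\tag{108}
$$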


The coefficient of variation squared gives the noise arising from partitioning and cell-cycle variations; subtracting equation (100) from this result gives the partitioning noise as

$$
CV^2_R = \frac{4\alpha(2-\beta)}{3\left((\beta^2-4\beta+6)+\beta^2CV^2_{T_1}+2(1-\beta)^2CV^2_{T_2}\right)}\frac{1}{\langle x\rangle}.
\tag{109}
$$
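Equation (109) is simple enough to evaluate directly. The sketch below is illustrative only; the function and argument names are assumptions, with `alpha` the partitioning-error parameter, `beta` the mean fraction of the cell cycle spent before gene duplication, and `mean_x` the mean protein level.

```python
def cycle_factor(beta, cv2_t1=0.0, cv2_t2=0.0):
    """Bracketed term shared by the denominators of the noise formulas."""
    return (beta**2 - 4 * beta + 6) + beta**2 * cv2_t1 + 2 * (1 - beta)**2 * cv2_t2


def partitioning_noise(alpha, beta, mean_x, cv2_t1=0.0, cv2_t2=0.0):
    """Partitioning noise CV_R^2 as written in equation (109)."""
    return 4 * alpha * (2 - beta) / (3 * cycle_factor(beta, cv2_t1, cv2_t2)) / mean_x
```

For deterministic cell-cycle phases the two limits `beta = 0` and `beta = 1` both give $4\alpha/(9\langle x\rangle)$, the value for a gene whose dosage never changes during the cycle, and intermediate duplication times give larger values at the same mean.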
E.4 Noise in protein count level contributed from stochastic production

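In order to calculate the noise caused by stochastic birth of protein, we use the model introduced in Section C.1. For this model, the moment dynamics of $\langle x^2\rangle$, $\langle x^2 s_{ij}\rangle$, and $\langle x^2 g_{ij}\rangle$ can be written as

$$
\frac{d\langle x^2\rangle}{dt} = k_x\langle B^2\rangle(2-\beta)+2k_x\langle B\rangle\left(\langle x\rangle+\sum_{i=1}^{n_2}\sum_{j=1}^{i}\langle x g_{ij}\rangle\right)+\frac{1}{4}\alpha k_2\sum_{j=1}^{n_2}\langle x g_{jj}\rangle-\frac{3}{4}k_2\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle,
\tag{110a}
$$

$$
\frac{d\langle x^2 s_{i1}\rangle}{dt} = \frac{k_x\langle B^2\rangle\beta p_i}{k_1\langle T_1\rangle}+2k_x\langle B\rangle\langle x s_{i1}\rangle+\frac{k_2}{4}p_i\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle+\frac{1}{4}\alpha k_2 p_i\sum_{j=1}^{n_2}\langle x g_{jj}\rangle-k_1\langle x^2 s_{i1}\rangle,
\tag{110b}
$$

$$
\frac{d\langle x^2 s_{ij}\rangle}{dt} = \frac{k_x\langle B^2\rangle\beta p_i}{k_1\langle T_1\rangle}+2k_x\langle B\rangle\langle x s_{ij}\rangle-k_1\langle x^2 s_{ij}\rangle+k_1\langle x^2 s_{i(j-1)}\rangle, \qquad j=\{2,\dots,i\},
\tag{110c}
$$

$$
\frac{d\langle x^2 g_{i1}\rangle}{dt} = \frac{2k_x\langle B^2\rangle(1-\beta)p'_i}{k_2\langle T_2\rangle}+4k_x\langle B\rangle\langle x g_{i1}\rangle+k_1 p'_i\sum_{j=1}^{n_1}\langle x^2 s_{jj}\rangle-k_2\langle x^2 g_{i1}\rangle,
\tag{110d}
$$

$$
\frac{d\langle x^2 g_{ij}\rangle}{dt} = \frac{2k_x\langle B^2\rangle(1-\beta)p'_i}{k_2\langle T_2\rangle}+4k_x\langle B\rangle\langle x g_{ij}\rangle-k_2\langle x^2 g_{ij}\rangle+k_2\langle x^2 g_{i(j-1)}\rangle, \qquad j=\{2,\dots,i\}.
\tag{110e}
$$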

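As before, we start by expressing the term $\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle$; this term is calculated by analyzing equation (110a) in steady-state:

$$
\begin{aligned}
\sum_{j=1}^{n_2}\langle x^2 g_{jj}\rangle ={}& \frac{4k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_2}+\frac{16k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_2} \\
&+\frac{2\alpha k_x\langle B\rangle(2-\beta)}{3k_2}+\frac{4k_x(2-\beta)\langle B^2\rangle}{3k_2}.
\end{aligned}
\tag{111}
$$

Substituting this term in equations (110b) and (110c) results in

$$
\begin{aligned}
\langle x^2 s_{ij}\rangle ={}& \frac{k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_1}\,p_i + \frac{4k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_1}\,p_i \\
&+ \frac{2k_x^2\langle B\rangle^2}{k_1}\left(\frac{\beta(j^2+j)}{2k_1^2\langle T_1\rangle}+\frac{(2-\beta)j}{k_1}\right)p_i + \frac{2\alpha k_x(2-\beta)\langle B\rangle}{3k_1}\,p_i \\
&+ \frac{k_x\langle B^2\rangle}{k_1}\left(\frac{2-\beta}{3}+\frac{\beta j}{k_1\langle T_1\rangle}\right)p_i, \qquad i=\{1,\dots,n_1\},\ j=\{1,\dots,i\}.
\end{aligned}
\tag{112}
$$

Similar to the previous section, solving equations (110d) and (110e) gives $\langle x^2 g_{ij}\rangle$:

$$
\begin{aligned}
\langle x^2 g_{ij}\rangle ={}& \frac{4k_x^2\langle B\rangle^2\langle T_1\rangle\left((4-\beta)+\beta CV^2_{T_1}\right)}{3k_2}\,p'_i + \frac{4k_x^2\langle B\rangle^2\langle T_2\rangle\left((3-\beta)+(1-\beta)CV^2_{T_2}\right)}{3k_2}\,p'_i \\
&+ \frac{8k_x^2\langle B\rangle^2}{k_2}\left(\frac{(1-\beta)(j^2+j)}{2k_2^2\langle T_2\rangle}+\frac{j}{k_2}\right)p'_i + \frac{2\alpha k_x\langle B\rangle(2-\beta)}{3k_2}\,p'_i \\
&+ \frac{2k_x\langle B^2\rangle}{k_2}\left(\frac{1+\beta}{3}+\frac{(1-\beta)j}{k_2\langle T_2\rangle}\right)p'_i, \qquad i=\{1,\dots,n_2\},\ j=\{1,\dots,i\}.
\end{aligned}
\tag{113}
$$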
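Finally, summing all the moments $\langle x^2 s_{ij}\rangle$ and $\langle x^2 g_{ij}\rangle$ results in $\langle x^2\rangle$ as

$$
\begin{aligned}
\langle x^2\rangle ={}& k_x^2\langle B\rangle^2\,\frac{\langle T_1^3\rangle+4\langle T_2^3\rangle+2\langle T\rangle^3\left(3(2-\beta)^2+\beta^2(5-3\beta)CV^2_{T_1}+8(1-\beta)^2CV^2_{T_2}\right)}{3\langle T\rangle}+\frac{2\alpha k_x\langle B\rangle(2-\beta)\langle T\rangle}{3} \\
&+ k_x\langle B^2\rangle\left(\frac{2-\beta}{3}+\beta\,\frac{1+CV^2_{T_1}}{2}\right)\langle T_1\rangle + 2k_x\langle B^2\rangle\left(\frac{1+\beta}{3}+(1-\beta)\,\frac{1+CV^2_{T_2}}{2}\right)\langle T_2\rangle.
\end{aligned}
\tag{114}
$$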


Steady-state analysis gives the total noise from stochastic birth, random partitioning, and cell-cycle time variations. Subtracting the cell-cycle time and partitioning noise in equations (100) and (109) gives the noise caused by stochastic production of protein

$$
CV^2_P = \frac{(10-8\beta+3\beta^2)+3\beta^2CV^2_{T_1}+6(1-\beta)^2CV^2_{T_2}}{3\left((\beta^2-4\beta+6)+\beta^2CV^2_{T_1}+2(1-\beta)^2CV^2_{T_2}\right)}\frac{\langle B^2\rangle}{\langle B\rangle}\frac{1}{\langle x\rangle}.
\tag{115}
$$
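The production-noise formula (115) can be evaluated in the same way. This is a hedged sketch with hypothetical names: `b_mean` and `b_second` stand for $\langle B\rangle$ and $\langle B^2\rangle$. The $CV^2_{T_2}$ term in the numerator is an assumption here; it follows from accumulating the burst variance over a random post-duplication phase and is needed for the $\beta = 0$ limit to match the $\beta = 1$ limit with the roles of the two phases exchanged.

```python
def production_noise(beta, b_mean, b_second, mean_x, cv2_t1=0.0, cv2_t2=0.0):
    """Noise from stochastic bursty production, following equation (115)."""
    denom = (beta**2 - 4 * beta + 6) + beta**2 * cv2_t1 + 2 * (1 - beta)**2 * cv2_t2
    # Assumed numerator, including the CV^2_{T_2} term described above.
    numer = (10 - 8 * beta + 3 * beta**2) + 3 * beta**2 * cv2_t1 + 6 * (1 - beta)**2 * cv2_t2
    return numer / (3 * denom) * (b_second / b_mean) / mean_x


def production_noise_alt(beta, b_mean, b_second, mean_x, cv2_t1=0.0, cv2_t2=0.0):
    """Same quantity, using numerator = 3 * denominator - 4 * (2 - beta)."""
    denom = (beta**2 - 4 * beta + 6) + beta**2 * cv2_t1 + 2 * (1 - beta)**2 * cv2_t2
    return (1 - 4 * (2 - beta) / (3 * denom)) * (b_second / b_mean) / mean_x
```

The second form makes the link to the partitioning noise explicit: both depend on $\beta$ only through $(2-\beta)$ divided by the shared denominator, so at a fixed mean the production noise is smallest exactly where the partitioning noise is largest.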
E.5 Effect of gene-duplication time in intrinsic noise

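We investigate how the noise contributions from random partitioning and stochastic expression ($CV^2_R$ and $CV^2_P$ terms in equation (34) of the main text) change as $\beta$ is varied between 0 and 1. Results show that $CV^2_R$ and $CV^2_P$ follow the same qualitative shapes as reported in Fig. 6. There exists a $\beta^*$

$$
\beta^* = \frac{2CV^2_{T_1}+4CV^2_{T_2}+2-\sqrt{2\left(2CV^4_{T_1}+5CV^2_{T_1}CV^2_{T_2}+3CV^2_{T_1}+2CV^4_{T_2}+3CV^2_{T_2}+1\right)}}{CV^2_{T_1}+2CV^2_{T_2}+1}
\tag{116}
$$

such that $CV^2_P$ is minimized and $CV^2_R$ is maximized when $\beta = \beta^*$. Note that when $CV^2_{T_1} = CV^2_{T_2} = 0$, $\beta^* = 2-\sqrt{2}$ as reported in the main text. The minimum value of $CV^2_P$ and the maximum value of $CV^2_R$ are given by

$$
CV^2_P = \frac{CV^2_{T_1}\left(3CV^2_{T_2}+7\right)-\sqrt{2\left(2CV^2_{T_1}+CV^2_{T_2}+1\right)\left(CV^2_{T_1}+2CV^2_{T_2}+1\right)}+7CV^2_{T_2}+3}{3\left(CV^2_{T_1}\left(CV^2_{T_2}+3\right)+3CV^2_{T_2}+1\right)}\frac{\langle B^2\rangle}{\langle B\rangle}\frac{1}{\langle x\rangle},
\tag{117}
$$

$$
CV^2_R = \frac{\alpha\left(\sqrt{2\left(2CV^2_{T_1}+CV^2_{T_2}+1\right)\left(CV^2_{T_1}+2CV^2_{T_2}+1\right)}+2CV^2_{T_1}+2CV^2_{T_2}\right)}{3\left(CV^2_{T_1}\left(CV^2_{T_2}+3\right)+3CV^2_{T_2}+1\right)}\frac{1}{\langle x\rangle},
\tag{118}
$$

respectively. Plots of $\beta^*$ and the optimal values of $CV^2_P$ and $CV^2_R$ as a function of $CV^2_{T_1}$ are shown in Fig. S3. Note that if noise in $T_1$ is high and $T_2$ is deterministic then $\beta^*$ shifts towards zero. Similarly, if noise in $T_2$ is high and $T_1$ is deterministic then $\beta^*$ shifts towards one.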
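The optimal duplication time can be cross-checked numerically. In the sketch below (function names are illustrative), `beta_star` uses the form $\beta^* = 2-\sqrt{2(1+2CV^2_{T_1}+CV^2_{T_2})/(1+CV^2_{T_1}+2CV^2_{T_2})}$, which is algebraically the same as equation (116), and `beta_star_grid` locates the maximum of the $\beta$-dependent factor of the partitioning noise, $(2-\beta)$ over the shared denominator, by brute force on a grid.

```python
import math


def beta_star(cv2_t1, cv2_t2):
    """Closed-form optimal gene-duplication fraction."""
    a, b = cv2_t1, cv2_t2
    return 2.0 - math.sqrt(2.0 * (1 + 2 * a + b) / (1 + a + 2 * b))


def partition_shape(beta, cv2_t1, cv2_t2):
    """Beta-dependent factor of the partitioning noise at a fixed mean level."""
    denom = (beta**2 - 4 * beta + 6) + beta**2 * cv2_t1 + 2 * (1 - beta)**2 * cv2_t2
    return (2 - beta) / denom


def beta_star_grid(cv2_t1, cv2_t2, n=100_001):
    """Brute-force maximizer of partition_shape over a uniform grid on [0, 1]."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=lambda beta: partition_shape(beta, cv2_t1, cv2_t2))
```

Calling `beta_star(0.0, 0.0)` gives $2-\sqrt{2}$, and the grid search agrees with the closed form to within the grid spacing for other noise levels, for example `beta_star(0.3, 0.1)` against `beta_star_grid(0.3, 0.1)`.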


F Noise level in unstable protein 


Consider an unstable protein with a sufficiently high degradation rate $\gamma_x$ such that the protein level reaches steady-state instantaneously compared to the cell-cycle time (Fig. S4). Let $\tau$ denote the time from the last division event, then

$$
\langle x|\tau<T_1\rangle = \frac{k_x\langle B\rangle}{\gamma_x}, \qquad \langle x|\tau>T_1\rangle = \frac{2k_x\langle B\rangle}{\gamma_x},
\tag{119}
$$

where $T_1$ is the time at which duplication happens. The mean level of an unstable protein can be calculated as

$$
\langle x\rangle = \langle x|\tau<T_1\rangle\,p(\tau<T_1)+\langle x|\tau>T_1\rangle\,p(\tau>T_1),
\tag{120}
$$


Figure S3: Effect of gene-duplication on intrinsic noise level. Left: Value of $\beta$ where $CV^2_P$ is minimized and $CV^2_R$ is maximized as a function of $CV^2_{T_1}$. When $CV^2_{T_1} = CV^2_{T_2}$, noise levels always reach their extrema at $\beta = 2-\sqrt{2}$. Middle & Right: Extremum values of $CV^2_P$ and $CV^2_R$ as functions of $CV^2_{T_1}$. Noise levels are normalized by their values at $\beta = 0$.
where $p(\tau<T_1)$ and $p(\tau>T_1)$ denote the probabilities of being in the time interval before and after gene-duplication, respectively. Using

$$
p(\tau<T_1) = \beta, \qquad p(\tau>T_1) = 1-\beta,
\tag{121}
$$

we obtain

$$
\langle x\rangle = \frac{k_x\langle B\rangle(2-\beta)}{\gamma_x}.
\tag{122}
$$

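To compute the extrinsic noise component we consider deterministic protein production and decay. The second-order moment of $x(t)$ is given by

$$
\langle x^2|\tau<T_1\rangle = \left(\frac{k_x\langle B\rangle}{\gamma_x}\right)^2, \quad \langle x^2|\tau>T_1\rangle = \left(\frac{2k_x\langle B\rangle}{\gamma_x}\right)^2 \;\Rightarrow\; \langle x^2\rangle = \left(\frac{k_x\langle B\rangle}{\gamma_x}\right)^2\beta+\left(\frac{2k_x\langle B\rangle}{\gamma_x}\right)^2(1-\beta).
\tag{123}
$$

By using the definition of $CV^2$, the extrinsic noise is

$$
CV^2_E = \frac{(1-\beta)\beta}{(2-\beta)^2},
\tag{124}
$$

which is zero at $\beta = 0, 1$ and reaches its maximum at $\beta = 2/3$ (Fig. S4).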
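The two-level picture behind equations (119) to (124) can be reproduced in a few lines. The sketch below (names are illustrative) builds the mean and second moment from the conditional levels before and after duplication, weights them by $\beta$ and $1-\beta$, and compares the result with the closed form $(1-\beta)\beta/(2-\beta)^2$.

```python
def unstable_extrinsic_cv2(beta):
    """Closed form of equation (124)."""
    return beta * (1 - beta) / (2 - beta)**2


def unstable_extrinsic_cv2_from_levels(beta, k_x=1.0, b_mean=1.0, gamma_x=1.0):
    """Same quantity built from the two quasi-steady levels in equation (119)."""
    low = k_x * b_mean / gamma_x          # level before gene duplication
    high = 2 * k_x * b_mean / gamma_x     # level after gene duplication
    mean = low * beta + high * (1 - beta)
    second = low**2 * beta + high**2 * (1 - beta)
    return second / mean**2 - 1.0


def argmax_on_grid(n=30_001):
    """Grid location of the maximum of the extrinsic noise over beta in [0, 1]."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=unstable_extrinsic_cv2)
```

The extrinsic noise does not depend on $k_x$, $\langle B\rangle$, or $\gamma_x$, only on $\beta$. Setting the derivative of (124) to zero gives $2-3\beta = 0$, so the maximum sits at $\beta = 2/3$ with value $1/8$.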

Next we compute the intrinsic noise component. If the protein decay is sufficiently fast, the noise contribution from partitioning errors will be negligible because any errors will be instantaneously corrected due to rapid protein turnover. Noise arising from stochastic gene expression can be investigated by considering a model containing stochastic bursty production and stochastic degradation of proteins, where after gene-duplication the burst frequency doubles. Again assuming a large enough degradation rate, $\langle x^2|\tau<T_1\rangle$ is equal to the steady-state second-order moment of a stochastic model with burst frequency $k_x$ (analyzed in [?])

$$
\langle x^2|\tau<T_1\rangle = \left(\frac{k_x\langle B\rangle}{\gamma_x}\right)^2+\frac{k_x\langle B^2\rangle}{2\gamma_x}+\frac{k_x\langle B\rangle}{2\gamma_x}.
\tag{125}
$$

In comparison with equation (123), there are two extra terms on the right-hand side of $\langle x^2|\tau<T_1\rangle$. The first extra term is due to production of protein in random bursts and the second one is due to stochastic degradation of protein molecules. Further, for the same reasons (large degradation rate and rapid equilibration of the distribution), $\langle x^2|\tau>T_1\rangle$ is equal to the second-order moment of a model containing stochastic bursty production of proteins with burst frequency $2k_x$, which is

$$
\langle x^2|\tau>T_1\rangle = \left(\frac{2k_x\langle B\rangle}{\gamma_x}\right)^2+\frac{k_x\langle B^2\rangle}{\gamma_x}+\frac{k_x\langle B\rangle}{\gamma_x}.
\tag{126}
$$

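Thus the second-order moment of an unstable protein can be written as

$$
\langle x^2\rangle = \left(\left(\frac{k_x\langle B\rangle}{\gamma_x}\right)^2+\frac{k_x\langle B^2\rangle}{2\gamma_x}+\frac{k_x\langle B\rangle}{2\gamma_x}\right)\beta+\left(\left(\frac{2k_x\langle B\rangle}{\gamma_x}\right)^2+\frac{k_x\langle B^2\rangle}{\gamma_x}+\frac{k_x\langle B\rangle}{\gamma_x}\right)(1-\beta).
\tag{127}
$$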


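Using the definition of $CV^2$ and subtracting the extrinsic noise, we obtain the following noise contribution from stochastic expression and decay:

$$
CV^2_P = \frac{\langle B^2\rangle+\langle B\rangle}{2\langle B\rangle}\frac{1}{\langle x\rangle}.
\tag{128}
$$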
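The step from (127) to (128) is a short piece of algebra that is easy to verify numerically. The sketch below (names are illustrative) assembles the second moment from the two conditional moments, subtracts the extrinsic part $(1-\beta)\beta/(2-\beta)^2$, and compares with $(\langle B^2\rangle+\langle B\rangle)/(2\langle B\rangle\langle x\rangle)$. The example values assume a geometric burst on $\{1, 2, \dots\}$ with mean 6, for which $\langle B^2\rangle = 66$; a geometric burst that includes zero would give a different second moment.

```python
def unstable_intrinsic_cv2(k_x, b_mean, b_second, gamma_x, beta):
    """Intrinsic noise: total CV^2 from (127) minus the extrinsic part (124)."""
    before = (k_x * b_mean / gamma_x)**2 + k_x * b_second / (2 * gamma_x) \
        + k_x * b_mean / (2 * gamma_x)
    after = (2 * k_x * b_mean / gamma_x)**2 + k_x * b_second / gamma_x \
        + k_x * b_mean / gamma_x
    second = before * beta + after * (1 - beta)
    mean = k_x * b_mean * (2 - beta) / gamma_x
    extrinsic = beta * (1 - beta) / (2 - beta)**2
    return second / mean**2 - 1.0 - extrinsic


def burst_frequency_for_mean(mean_x, b_mean, gamma_x, beta):
    """Burst frequency k_x that keeps the mean level fixed as beta varies."""
    return mean_x * gamma_x / (b_mean * (2 - beta))
```

With `gamma_x = 10`, `b_mean = 6`, `b_second = 66` and the burst frequency chosen by `burst_frequency_for_mean(100, 6, 10, beta)`, the intrinsic noise equals $(66+6)/(2\cdot 6\cdot 100) = 0.06$ for every $\beta$, so at a fixed mean only the extrinsic component depends on the duplication time.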
[Figure S4; horizontal axis: Gene-duplication time / Cell-cycle time]

Figure S4: Contribution of gene duplication to noise levels of an unstable protein. Left: For a stable protein, copy numbers accumulate in a bilinear fashion. In contrast, an unstable protein reaches equilibrium rapidly and its level changes in steps. Right: Extrinsic and intrinsic noise predicted for an unstable protein as a function of $\beta$. Solid lines are predictions from (124) and (128), which agree with estimates from 20,000 Monte Carlo simulations. Parameters taken as $\gamma_x = 10\ \mathrm{hr}^{-1}$ and a geometric burst with $\langle B\rangle = 6$. Burst frequency is changed to have a constant mean protein level of 100 molecules for different values of $\beta$. 95% confidence intervals are calculated via bootstrapping.